Regular expression

Like old typewriters, plain base characters (white spaces, punctuation characters, symbols, digits, or letters) can be followed by one or more non-spacing symbols (usually diacritics, like accent marks modifying letters) to form a single printable character; but Unicode also provides a limited set of precomposed characters, i.e. characters that already include one or more combining characters. A sequence of a base character + combining characters should be matched with the identical single precomposed character (only some of these combining sequences can be precomposed into a single Unicode character, but infinitely many other combining sequences are possible in Unicode, and needed for various languages, using one or more combining characters after an initial base character; these combining sequences
In the regex 'b.', 'b' is a literal character that matches just 'b', while '.' is a metacharacter that matches every character except a newline. Therefore, this regex matches, for example, 'b%', or 'bx', or 'b5'. Together, metacharacters and literal characters can be used to identify text of a given pattern or process a number of instances of it. Pattern matches may vary from a precise equality to a very general similarity, as controlled by the metacharacters.
The explicit approach is called the DFA algorithm and the implicit approach the NFA algorithm. Adding caching to the NFA algorithm is often called the "lazy DFA" algorithm, or just the DFA algorithm without making a distinction. These algorithms are fast, but using them for recalling grouped subexpressions, lazy quantification, and similar features is tricky. Modern implementations include the re1-
used since at least 1970, as well as some more sophisticated extensions like lookaround that appeared in 1994. Lookarounds define the surrounding of a match and do not spill into the match itself, a feature only relevant for the use case of string searching. Some of them can be simulated in a regular language by treating the surroundings as a part of the language as well.
"Regular expressions" are only marginally related to real regular expressions. Nevertheless, the term has grown with the capabilities of our pattern matching engines, so I'm not going to try to fight linguistic necessity here. I will, however, generally call them "regexes" (or "regexen", when I'm in an Anglo-Saxon mood).
processing them across the entire database could consume excessive computer resources depending on the complexity and design of the regex. Although in many cases system administrators can run regex-based queries internally, most search engines do not offer regex support to the public. Notable exceptions include
kind of backtracking. Some implementations try to provide the best of both algorithms by first running a fast DFA algorithm, and revert to a potentially slower backtracking algorithm only when a backreference is encountered during the match. GNU grep (and the underlying gnulib DFA) uses such a strategy.
4261:, do not allow character ranges to cross Unicode blocks. A range like is valid since both endpoints fall within the Basic Latin block, as is since both endpoints fall within the Armenian block, but a range like is invalid since it includes multiple Unicode blocks. Other engines, such as that of the 3595:
support multiple regex flavors. Perl-derivative regex implementations are not identical and usually implement a subset of features found in Perl 5.0, released in 1994. Perl sometimes does incorporate features initially found in other languages. For example, Perl 5.10 implements syntactic extensions
or not); a logical OR character, which offers a set of alternatives, and a logical NOT character, which negates an atom's existence; and backreferences to refer to previous atoms of a completing pattern of atoms. A match is made, not when all the atoms of the string are matched, but rather when all
Although backtracking implementations only give an exponential guarantee in the worst case, they provide much greater flexibility and expressive power. For example, any implementation which allows the use of backreferences, or implements the various extensions introduced by Perl, must include some
in BRE. Furthermore, as long as the POSIX standard syntax for regexes is adhered to, there can be, and often is, additional syntax to serve specific (yet POSIX compliant) applications. Although POSIX.2 leaves some implementation specifics undefined, BRE and ERE provide a "standard" which has since
1854:, where it forms part of the syntax distinct from normal string literals. In some cases, such as sed and Perl, alternative delimiters can be used to avoid collision with contents, and to avoid having to escape occurrences of the delimiter character in the contents. For example, in sed the command 2326:
According to Russ Cox, the POSIX specification requires ambiguous subexpressions to be handled in a way different from Perl's. The committee replaced Perl's rules with one that is simple to explain, but the new "simple" rules are actually more complex to implement: they were incompatible with
IETF RFC 9485 describes "I-Regexp: An Interoperable Regular Expression Format". It specifies a limited subset of regular-expression idioms designed to be interoperable, i.e. produce the same effect, in a large number of regular-expression libraries. I-Regexp is also limited to matching, i.e.
A few theoretical alternatives to backtracking for backreferences exist, and their "exponents" are tamer in that they are only related to the number of backreferences, a fixed property of some regexp languages such as POSIX. One naive method that duplicates a non-backtracking NFA for each
Because regexes can be difficult to both explain and understand without examples, interactive websites for testing regexes are a useful resource for learning regexes by experimentation. This section provides a basic description of some of the properties of regexes by way of illustration.
497:, is often used to mean the specific, standard textual syntax for representing patterns for matching text, as distinct from the mathematical notation described below. Each character in a regular expression (that is, each character in the string describing its pattern) is either a 4311:
include a base character or combining characters partially precomposed, but not necessarily in canonical order and not necessarily using the canonical precompositions). The process of standardizing sequences of a base character + combining characters by decomposing these
47: 2051:, and exactly which characters are considered newlines is flavor-, character-encoding-, and platform-specific, but it is safe to assume that the line feed character is included). Within POSIX bracket expressions, the dot character matches a literal dot. For example, 516:
is a precise pattern (matches just 'b'). The metacharacter syntax is designed specifically to represent prescribed targets in a concise and flexible way to direct the automation of text processing of a variety of input data, in a form easy to type using a standard
Regular expressions consist of constants, which denote sets of strings, and operator symbols, which denote operations over these sets. The following definition is standard, and found as such in most textbooks on formal language theory. Given a finite
4257:). The natural extension of such character ranges to Unicode would simply change the requirement that the endpoints lie in to the requirement that they lie in . However, in practice this is often not the case. Some implementations, such as that of 4178:
space for a haystack of length n and k backreferences in the regex. More recent theoretical work based on memory automata gives a tighter bound based on the "active" variable nodes used, and shows that some regexes with backreferences can be matched in polynomial time.
An alternative approach is to simulate the NFA directly, essentially building each DFA state on demand and then discarding it at the next step. This keeps the DFA implicit and avoids the exponential construction cost, but running cost rises to
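The on-demand simulation can be sketched in Python. This is a minimal illustration under stated assumptions, not the code of any particular engine: the NFA for the pattern (a|b)*abb is written out by hand as a hypothetical transition table, and the name simulate_nfa is illustrative only. The set `current` plays the role of the implicit DFA state, which is rebuilt at every input symbol and then discarded.

```python
# Hand-built NFA (without epsilon moves) for the pattern (a|b)*abb.
# Keys are (state, symbol); values are the sets of possible next states.
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
ACCEPTING = {3}


def simulate_nfa(text):
    """Return True if the whole of `text` is accepted by the NFA above."""
    current = {START}              # the implicit DFA state: a set of NFA states
    for symbol in text:
        following = set()
        for state in current:
            following |= NFA.get((state, symbol), set())
        current = following        # the previous set is simply dropped
        if not current:            # no live NFA state left: reject early
            return False
    return bool(current & ACCEPTING)
```

Each input symbol costs work proportional to the number of NFA states, which is why this approach trades the exponential construction cost for a higher per-symbol running cost.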
pre-existing tooling and made it essentially impossible to define a "lazy match" (see below) extension. As a result, very few programs actually implement the POSIX subexpression rules (even when they implement other parts of the POSIX syntax).
4271:. Some case-insensitivity flags affect only the ASCII characters. Other flags affect all characters. Some engines have two different flags, one for ASCII, the other for Unicode. Exactly which characters belong to the POSIX classes also varies. 408:
element group syntax. Prior to the use of regular expressions, many search languages allowed simple wildcards, for example "*" to match any sequence of characters, and "?" to match a single character. Relics of this can be found today in the
software has the ability to use regexes to automatically apply text styling, saving the person doing the layout from laboriously doing this by hand for anything that can be matched by a regex. For example, by defining a
1273:. In principle, the complement operator is redundant, because it does not grant any more expressive power. However, it can make a regular expression much more concise—eliminating a single complement operator can cause a 1682:
over finite words. This is a surprisingly difficult problem. As simple as the regular expressions are, there is no method to systematically rewrite them to some normal form. The lack of axioms in the past led to the
and related DFA optimization techniques such as the reverse scan. GNU grep, which supports a wide variety of POSIX syntaxes and extensions, uses BM for a first-pass prefiltering, and then uses an implicit DFA. Wu
Finally, it is worth noting that many real-world "regular expression" engines implement features that cannot be described by the regular expressions in the sense of formal language theory; rather, they implement
the pattern atoms in the regex have matched. The idea is to make a small pattern of characters stand for a large number of possible strings, rather than compiling a large list of all the literal possibilities.
Regular expressions entered popular use from 1968 in two uses: pattern matching in a text editor and lexical analysis in a compiler. Among the first appearances of regular expressions in program form was when
3482:) since in many programming languages the characters that can begin an identifier are not the same as those that can occur in other positions: numbers are generally excluded, so an identifier would look like 7775: 2187:
is a digit from 1 to 9. This construct is defined in the POSIX standard. Some tools allow referencing more than nine capturing groups. Also known as a back-reference, this feature is supported in BRE mode.
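A backreference can be tried out with Python's re module, which is an assumption of this sketch rather than something the text prescribes. Python follows Perl-style syntax, so the group is written ( ) instead of the BRE form \( \), but \1 has the same meaning. The names `doubled` and `repeated_half` are illustrative only.

```python
import re

# (\w+) captures one or more word characters as group 1;
# \1 must then repeat exactly the text that group 1 captured.
doubled = re.compile(r"(\w+)\1")


def repeated_half(word):
    """Return the repeated half of a doubled word, or None if there is none."""
    match = doubled.fullmatch(word)
    return match.group(1) if match else None
```

For instance, repeated_half("papa") gives "pa", while repeated_half("papaya") gives None because no split of the word yields two identical halves.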
under string concatenation. This is the set of all strings that can be made by concatenating any finite number (including zero) of strings from the set described by R. For example, if R denotes {"0", "1"},
but many can. For example, the set of positive examples {1, 10, 100} and the negative set (of counterexamples) {11, 1001, 101, 0} can be used to induce the regular expression 1⋅0* (1 followed by zero or more 0s).
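The induced expression can be checked against both example sets. The sketch below assumes Python's re module and writes 1⋅0* in practical syntax as 10*; it only verifies a candidate expression and does not perform the induction itself.

```python
import re

# 1 followed by zero or more 0s, i.e. the induced expression 1⋅0*.
candidate = re.compile(r"10*")

positive = ["1", "10", "100"]          # strings that must be accepted
negative = ["11", "1001", "101", "0"]  # counterexamples that must be rejected

accepted = [s for s in positive if candidate.fullmatch(s)]
rejected = [s for s in negative if not candidate.fullmatch(s)]

# The candidate is consistent with the examples when both lists are complete.
consistent = accepted == positive and rejected == negative
```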
denotes the set of strings that can be obtained by concatenating a string accepted by R and a string accepted by S (in that order). For example, let R denote {"ab", "c"} and S denote {"d", "ef"}. Then,
4336:. Block properties are much less useful than script properties, because a block can have code points from several different scripts, and a script can have code points from several different blocks. In 1719:. An atom is a single point within the regex pattern which it tries to match to the target string. The simplest atom is a literal, but grouping parts of the pattern to match an atom will require using 1403: 1178:
To avoid parentheses, it is assumed that the Kleene star has the highest priority followed by concatenation, then alternation. If there is no ambiguity, then parentheses may be omitted. For example,
in that regular language, it is possible to induce a grammar for the language, i.e., a regular expression that generates that language. Not all regular languages can be induced in this way (see
that contain both alternation and unbounded quantification and force the algorithm to consider an exponentially increasing number of sub-cases. This behavior can cause a security problem called
4024:. This algorithm is commonly called NFA, but this terminology can be confusing. Its running time can be exponential, which simple implementations exhibit when matching against expressions like 4068:, which implements approximate matching, combines the prefiltering into the DFA in BDM (backward DAWG matching). NR-grep's BNDM extends the BDM technique with Shift-Or bit-level parallelism. 1790:
When a regex is entered in a programming language, it may be represented as a usual string literal, hence usually quoted; this is common in C, Java, and Python for instance, where the regex
requires computing the modulus of the integer base 11, and can be easily implemented with an 11-state DFA. However, converting it to a regular expression results in a 2.14-megabyte file.
In theoretical terms, any token set can be matched by regular expressions as long as it is pre-defined. In terms of historical implementations, regexes were originally written to use
In the opposite direction, there are many languages easily described by a DFA that are not easily described by a regular expression. For instance, determining the validity of a given
1553:(NFAs) that does not lead to such a blowup in size; for this reason NFAs are often used as alternative representations of regular languages. NFAs are a simple variation of the type-3 4643:
programming language, release 5.8.8, January 31, 2006. This means that other implementations may lack support for some parts of the syntax shown here (e.g. basic vs. extended regex,
4228:, that is, the characters which can be encoded with only 16 bits. Currently (as of 2016) only a few regex engines (e.g., Perl's and Java's) can handle the full 21-bit Unicode range. 1014:
These constructions can be combined to form arbitrarily complex expressions, much like one can construct arithmetical expressions from numbers and the operations +, −, ×, and ÷.
providing a true or false match between a regular expression and a given piece of text. Thus, it lacks advanced features such as capture groups, lookahead, and backreferences.
2504:), the computer's locale settings determine the contents by the numeric ordering of the character encoding. They could store digits in that sequence, or the ordering could be 1421: 385:, which are used to define Raku grammar as well as provide a tool to programmers in the language. These rules maintain existing features of Perl 5.x regexes, but also allow 720:
or members. However, there are often more concise ways: for example, the set containing the three strings "Handel", "Händel", and "Haendel" can be specified by the pattern
1284:. There is, however, a significant difference in compactness. Some classes of regular languages can only be described by deterministic finite automata whose size grows 396:
The use of regexes in structured information standards for document and database modeling started in the 1960s and expanded in the 1980s when industry standards like
3979:(DFA). The DFA can be constructed explicitly and then run on the resulting input string one symbol at a time. Constructing the DFA for a regular expression of size 2470:
The character class is the most basic regex concept after a literal match. It makes one small sequence of characters match a larger set of characters. For example,
denotes the set of binary numbers that are multiples of 3: { ε, "0", "00", "11", "000", "011", "110", "0000", "0011", "0110", "1001", "1100", "1111", "00000", ... }
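The expression for this set did not survive in the text above, so the sketch below assumes the standard one, (0|(1(01*0)*1))*, which can be read off the three-state automaton that tracks the remainder modulo 3. Python's re module and the names used are likewise assumptions. The check compares the regex with ordinary arithmetic.

```python
import re

# Each pass through the outer star returns the running value to remainder 0:
# either a single 0, or 1 (01*0)* 1, which leaves remainder 0 and comes back.
MULTIPLE_OF_3 = re.compile(r"(0|(1(01*0)*1))*")


def is_multiple_of_3(bits):
    """True if the binary string `bits` (possibly empty) denotes a multiple of 3."""
    return MULTIPLE_OF_3.fullmatch(bits) is not None


# Compare against integer arithmetic for every value below 500.
mismatches = [
    n for n in range(500)
    if is_multiple_of_3(format(n, "b")) != (n % 3 == 0)
]
```

The empty string and strings with leading zeros such as "0011" are accepted as well, matching the set listed above.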
If the scanner detects a transition on backref, it returns a kind of "semi-success" indicating that the match will have to be verified with a backtracking matcher.
are treated as metacharacters unless escaped; other metacharacters are known to be literal or symbolic based on context alone. Additional functionality includes
334:(which has its own, incompatible syntax and behavior). Regexes were subsequently adopted by a wide range of programs, with these early forms standardized in the 2140:
Matches the ending position of the string or the position just before a string-ending newline. In line-based tools, it matches the ending position of any line.
Perl regexes have become a de facto standard, having a rich and powerful set of atomic expressions. Perl has no "basic" or "extended" levels. As in POSIX EREs,
between regular-expression terms with variables holds if, and only if, its instantiation with different variables replaced by different symbol constants holds.
meaning "Global search for Regular Expression and Print matching lines"). Around the same time when Thompson developed QED, a group of researchers including
by appending a plus sign, which disables backing off (in a backtracking engine), even if doing so would allow the overall match to succeed: While the regex
similar names in a list of files, whereas regexes are usually employed in applications that pattern-match text strings in general. For example, the regex
is not applicable. For scripts like Chinese, another distinction seems logical: between traditional and simplified. In Arabic scripts, insensitivity to
4277:. As ASCII has case distinction, case insensitivity became a logical feature in text searching. Unicode introduced alphabetic scripts without case like 4188: 2512:. So the POSIX standard defines a character class, which will be known by the regex processor installed. Those definitions are in the following table: 6881: 1617:
Algebraic laws for regular expressions can be obtained using a method by Gischer, which is best explained by way of an example: in order to check whether (
implementation with improved performance characteristics. Software projects that have adopted Spencer's Tcl regular expression implementation include
The choice (also known as alternation or set union) operator matches either the expression before or the expression after the operator. For example,
Another common extension serving the same function is atomic grouping, which disables backtracking for a parenthesized group. The typical syntax is
denotes the set of all strings with no symbols other than "a" and "b", including the empty string: {ε, "a", "b", "aa", "ab", "ba", "bb", "aaa", ...}
plus underscore. This reflects the fact that in many programming languages these are the characters that may be used in identifiers. The editor
matches any single character surrounded by "[" and "]" since the brackets are escaped, for example: "[a]", "[b]", "[7]", "[@]", "[]]", and "[ ]" (bracket space bracket).
denotes the set of strings starting with "a", then zero or more "b"s and finally optionally a "c": {"a", "ac", "ab", "abc", "abb", "abbc", ...}
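The listed set can be reproduced by brute force. This sketch assumes Python's re module and writes the optional "c" as c? (practical syntax has no symbol for ε); the helper name language_up_to is hypothetical.

```python
import re
from itertools import product

# "a", then zero or more "b"s, then an optional "c".
pattern = re.compile(r"ab*c?")


def language_up_to(max_len, alphabet="abc"):
    """List every string over `alphabet` of length 1..max_len that the pattern accepts."""
    found = []
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if pattern.fullmatch(candidate):
                found.append(candidate)
    return found
```

Calling language_up_to(3) yields "a", "ab", "ac", "abb" and "abc", the shortest members of the set described above.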
3751:. For example, many implementations allow grouping subexpressions with parentheses and recalling the value they match in the same expression ( 3551:
Because of its expressive power and (relative) ease of reading, many other utilities and programming languages have adopted syntax similar to
be primarily literal, and "escape" this usual meaning to become metacharacters. Common standards implement both. The usual metacharacters are
431:(Perl Compatible Regular Expressions), which attempts to closely mimic Perl's regex functionality and is used by many modern tools including 4621:
metacharacter(s) ;; the metacharacters column specifies the regex syntax being demonstrated
=~ m//           ;; indicates a regex
axioms. Already in 1964, Redko had proved that no finite set of purely equational axioms can characterize the algebra of regular languages.
9773: 9481: 8098: 9154: 8564: 7022: 6648: 4199:. In most respects it makes no difference what the character set is, but some issues do arise when extending regexes to support Unicode. 8001: 6686: 1822:
can be used to specify a range of lines (matching the pattern), which can be combined with other commands on either side, most famously
373:(formerly named Perl 6) is to improve Perl's regex integration, and to increase their scope and capabilities to allow the definition of 8848:
Kleene, Stephen C. (1951). "Representation of Events in Nerve Nets and Finite Automata". In Shannon, Claude E.; McCarthy, John (eds.).
7459: 4206:. Some regex libraries expect to work on some particular encoding instead of on abstract Unicode characters. Many of these require the 1607: 1606:
that, for two given regular expressions, decides whether the described languages are equal; the algorithm reduces each expression to a
This property need not hold for extended regular expressions, even if they describe no larger class than regular languages; cf. p.121.
9668: 8310: 7710: 4342: 9707: 6557: 3775: 2272:
matches any three-character string ending with "at", including "hat", "cat", "bat", "4at", "#at" and " at" (starting with a space).
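The pattern itself is missing from the line above; assuming it is .at, the behaviour can be reproduced with Python's re module (also an assumption of this sketch):

```python
import re

candidates = ["hat", "cat", "bat", "4at", "#at", " at", "at", "chat"]

# fullmatch requires the whole string to be exactly one character plus "at".
three_char_matches = [s for s in candidates if re.fullmatch(r".at", s)]
```

"at" is rejected because the dot must consume one character, and "chat" is rejected because it is four characters long.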
matches any single character that is not a lowercase letter from "a" to "z". Likewise, literal characters and ranges can be mixed.
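Assuming the elided bracket expression is [^a-z], a short Python sketch (the re module is an assumption here) shows which characters of a sample string it picks out:

```python
import re

sample = "abc XYZ 42!"

# The leading ^ inside the brackets negates the class, so every character
# other than the lowercase letters a..z is reported, one at a time.
non_lowercase = re.findall(r"[^a-z]", sample)
```

Spaces, uppercase letters, digits and punctuation all count as "not a lowercase letter".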
been adopted as the default syntax of many tools, where the choice of BRE or ERE modes is usually a supported option. For example,
each of the three strings. However, there can be many ways to write a regular expression for the same set of strings: for example,
666:(DFA) is run on the target text string to recognize substrings that match the regular expression. The picture shows the NFA scheme 185: 5446:
Matches a zero-width boundary between a word-class character (see next) and either a non-word class character or an edge; same as
1758:, they have a metacharacter escape to a literal mode; starting out, however, they instead have the four bracketing metacharacters 9633: 4060: 1599:
As seen in many of the examples above, there is more than one way to construct a regular expression to achieve the same results.
to apply that style, any word of four or more consecutive capital letters will be automatically rendered as small caps instead.
815:, character, or group) specifies how many times the preceding element is allowed to repeat. The most common quantifiers are the 539:
also achieve this, but are more limited in the patterns they can express, as they have fewer metacharacters and a simple language-base.
9638: 9368: 9147: 9059: 8029: 7151: 6602: 3630:
matches the entire line (because the entire line begins and ends with a double-quote) instead of matching only the first part,
translates a regular expression in the above syntax into an internal representation that can be executed and matched against a
Possessive quantifiers are easier to implement than greedy and lazy quantifiers, and are typically more efficient at runtime.
1754:. Modern and POSIX extended regexes use metacharacters more often than their literal meaning, so to avoid "backslash-osis" or 239:"prehensible", but admitted "We would welcome any suggestions as to a more descriptive term.") Other early implementations of 9127: 9119: 9111: 9003: 8984: 8952: 8901: 8794: 8761: 8725: 8659: 8634: 8430: 7748: 7357: 7314: 6875: 6680: 6642: 8767: 283:'s use of regular expressions ("grep" is a word derived from the command for regular expression searching in the ed editor: 9643: 9293: 8496: 7658: 6545: 4444: 4049: 1572: 1120: 651: 622: 397: 6528:
Regular expressions can often be created ("induced" or "learned") based on a set of example strings. This is known as the
4001:). Note that the size of the expression is the size after abbreviations, such as numeric quantifiers, have been expanded. 2322:
matches s followed by zero or more characters, for example: "s", "saw", "seed", "s3w96.7", and "s6#h%(>>>m n mQ".
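Assuming the elided pattern is s.*, the examples can be checked in Python (the re module is an assumption of this sketch). The last two lines also illustrate that the dot does not match a newline unless a flag such as re.DOTALL is given.

```python
import re

examples = ["s", "saw", "seed", "s3w96.7", "s6#h%(>>>m n mQ"]

# Every listed example is "s" followed by zero or more other characters.
all_match = all(re.fullmatch(r"s.*", s) is not None for s in examples)

# By default "." stops at a newline, so the whole string is not matched ...
without_flag = re.fullmatch(r"s.*", "s\nx")
# ... unless DOTALL lets "." match the newline as well.
with_dotall = re.fullmatch(r"s.*", "s\nx", flags=re.DOTALL)
```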
9383: 6964: 3546: 428: 40: 17: 7407: 4774:, ... later to refer to the previously matched pattern. Some implementations may use a backslash notation instead, like 3759:). This means that, among other things, a pattern can match strings of repeated words like "papa" or "WikiWiki", called 2343:) syntax. With this syntax, a backslash causes the metacharacter to be treated as a literal character. So, for example, 558:
matches excess whitespace at the beginning or end of a line. An advanced regular expression that matches any numeral is
9854: 9628: 9308: 8603: 8343:. The 'm' is only necessary if the user wishes to specify a match operation without using a forward-slash as the regex 6993: 6813: 5095:
The non-greedy match with 'l' followed by one or more characters is 'llo' rather than 'llo Wo'.
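The difference between the lazy and the greedy quantifier can be reproduced in Python. The pattern l.+?o and the input "Hello World" are assumptions chosen to be consistent with the result quoted above, since the original pattern is not shown here.

```python
import re

text = "Hello World"

# Lazy: after the first "l", take as few characters as possible before an "o".
lazy = re.search(r"l.+?o", text).group()

# Greedy: take as many characters as possible, backing off to the last "o".
greedy = re.search(r"l.+o", text).group()
```

Here `lazy` is "llo" and `greedy` is "llo Wo".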
7954: 7533:"Essential classes: Regular Expressions: Quantifiers: Differences Among Greedy, Reluctant, and Possessive Quantifiers" 6903: 3747:
Many features found in virtually all modern regular expression libraries provide an expressive power that exceeds the
Defines a marked subexpression. The string matched within the parentheses can be recalled later (see the next entry,
Matches the starting position within the string. In line-based tools, it matches the starting position of any line.
Depending on the regex processor there are about fourteen metacharacters, characters that may or may not have their
of sets described by R and S. For example, if R describes {"ab", "c"} and S describes {"ab", "d", "ef"}, expression
9570: 9476: 9461: 7734:
Reprinted as "QED Text Editor Reference Manual", MHCC-004, Murray Hill Computing, Bell Laboratories (October 1972).
There is an 'e' followed by zero to many 'l' followed by 'o' (e.g., eo, elo, ello, elllo).
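A quick check of the star quantifier, assuming the pattern behind this description is el*o and using Python's re module (both assumptions of this sketch):

```python
import re

pattern = re.compile(r"el*o")

words = ["eo", "elo", "ello", "elllo", "el", "exo"]

# "l*" allows any number of "l"s, including none, between "e" and "o".
accepted_words = [w for w in words if pattern.fullmatch(w)]

# search finds the first place in a longer string where the pattern occurs.
found_in_greeting = pattern.search("Hello World").group()
```

"el" is rejected because the final "o" is required, and "exo" because only "l" may appear between "e" and "o".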
because the engine is forbidden from backtracking and so cannot try setting the group to "w" after matching "wi".
matches only "aaa", "aaaa", and "aaaaa". This is not found in a few older instances of regexes. BRE mode requires
Given regular expressions R and S, the following operations over them are defined to produce regular expressions:
369:. Perl later expanded on Spencer's original library to add many new features. Part of the effort in the design of 9910: 9895: 9337: 8355: 7055: 7976:
Schmid, Markus L. (March 2019). "Regular Expressions with Backreferences: Polynomial-Time Matching Techniques".
The syntax and conventions used in these examples coincide with that of other programming environments as well.
computes an equivalent nondeterministic finite automaton. A conversion in the opposite direction is achieved by
Regular expressions in this sense can express the regular languages, exactly the class of languages accepted by
180:. Regular expressions are supported in many programming languages. Library implementations are often called an " 9750: 8849: 7257: 7122: 6529: 6523: 4466: 3976: 3797:
for their patterns. This has led to a nomenclature where the term regular expression has different meanings in
1743: 663: 362: 272: 138: 4130: 4077: 3790:, and the execution time for known algorithms grows exponentially by the number of backreference groups used. 527:
A very simple case of a regular expression in this syntax is to locate a word spelled two different ways in a
Any language in each category is generated by a grammar and by an automaton in the category in the same line.
must have at least $2^k$ states. Luckily, there is a simple mapping from regular expressions to the more general
1281: 716:
of strings required for a particular purpose. A simple way to specify a finite set of strings is to list its
455: 8275: 9885: 9755: 9610: 9354: 9279: 4218:. In contrast, Perl and Java are agnostic on encodings, instead operating on decoded characters internally. 3564: 1976: 689: 220: 115: 9134: 8579: 7688: 7532: 6859: 6785: 3916:
Other features not found in describing regular languages include assertions. These include the ubiquitous
9890: 9834: 9560: 9451: 9093: 8373: 8123: 7256:, a regular expression of length about 850 such that its complement has a length about 2 can be found at 7014: 6780: 4501: 4471: 3572: 3556: 1274: 1055: 501:, having a special meaning, or a regular character that has a literal meaning. For example, in the regex 451: 370: 9139: 6927: 6925: 6540:. Formally, given examples of strings in a regular language, and perhaps also given examples of strings 9839: 9717: 9684: 9620: 9347: 8920: 8688:
Proceedings of the 25th International Symposium on Theoretical Aspects of Computer Science (STACS 2008)
Kearns, Steven (August 2013). "Sublinear Matching With Finite Automata Using Reverse Suffix Scanning".
$$(a \mid b)^{*}\, a\, \underbrace{(a \mid b)(a \mid b) \cdots (a \mid b)}_{k-1\text{ times}}.$$
in the size of the shortest equivalent regular expressions. The standard example here is the languages
374: 103: 8876:
Kozen, Dexter (1991). "A completeness theorem for Kleene algebras and the algebra of regular events".
7496: 4334:
Introduction of character classes for Unicode blocks, scripts, and numerous other character properties
9905: 9900: 9849: 9745: 9689: 9590: 9544: 9272: 6922: 4580: 2067:
A bracket expression. Matches a single character that is contained within the brackets. For example,
Today, regexes are widely supported in programming languages, text processing programs (particularly
107: 99: 8746:
Proceedings of the 35th International Colloquium on Automata, Languages and Programming (ICALP 2008)
7270: 9844: 9648: 9580: 9424: 9419: 8532: 8376: 4517: 4456: 4225: 808: 405: 390: 275:, an important early example of JIT compilation. He later added this capability to the Unix editor 264: 9105:
Information technology – Portable Operating System Interface (POSIX) – Part 2: Shell and Utilities
Information technology – Portable Operating System Interface (POSIX) – Part 2: Shell and Utilities
6733: 247:
language, which did not use regular expressions, but instead its own pattern matching constructs.
9793: 9129:
Information technology – Portable Operating System Interface (POSIX) Base Specifications, Issue 7
Information technology – Portable Operating System Interface (POSIX) Base Specifications, Issue 7
The concept of regular events was introduced by Kleene via the definition of regular expressions.
POSIX Extended Regular Expressions can often be used with modern Unix utilities by including the
475: 9121:
Information technology – Portable Operating System Interface (POSIX) – Part 2: System Interfaces
Information technology – Portable Operating System Interface (POSIX) – Part 2: System Interfaces
8665: 7383:
Information technology – Portable Operating System Interface (POSIX) – Part 2: System Interfaces
specifies a range which matches any lowercase letter from "a" to "z". These forms can be mixed:
On the other hand, it is known that every deterministic finite automaton accepting the language
9435: 9373: 9298: 9087: 8527: 3793:
However, many tools, libraries, and engines that provide such constructions still use the term
1898: 770: 713: 236: 8512: 7294:
Gischer, Jay L. (1984). (Title unknown) (Technical Report). Stanford Univ., Dept. of Comp. Sc.
6632: 9798: 9740: 9600: 9528: 9378: 9326: 9103: 8976: 8967: 8558: 7997: 6670: 3771: 3588: 1576: 1158: 659: 91: 7455: 3782:
with an unbounded number of backreferences, as supported by numerous modern tools, is still
character meaning, depending on context, or whether they are "escaped", i.e. preceded by an
9605: 9471: 9446: 9303: 9264: 7613: 6572: 4600: 3241: 2339:
with a backslash is reversed for some characters in the POSIX Extended Regular Expression (
"There are one or more consecutive letter \"l\"'s in $string1.\n"
There is an 'H' and a 'e' separated by 0-1 characters (e.g., He Hue Hee).
have been attested since at least 1994, starting with Perl 5. The look-behind assertions
Many textbooks use the symbols ∪, +, or ∨ for alternation instead of the vertical bar.
Larry Wall, author of the Perl programming language, writes in an essay about the design of Raku:
The formal definition of regular expressions is minimal on purpose, and avoids defining
ε denoting the set containing only the "empty" string, which has no characters at all.
matches any string that contains an "a", and then the character "b" at some later point.
The concept of regular expressions began in the 1950s, when the American mathematician
editor, allow block-crossing but the character values must not be more than 256 apart.
The oldest and fastest relies on a result in formal language theory that allows every
character is treated as a literal character if it is the last or the first (after the
it is necessary and sufficient to check whether the particular regular expressions (
Johnson, Walter L.; Porter, James H.; Ackley, Stephanie I.; Ross, Douglas T. (1968).
and text direction markers. These codes might have to be dealt with in a special way.
for regular expressions varies among tools and with context; more detail is given in
implemented a tool based on regular expressions that is used for lexical analysis in
256: 9055: 9043: 8911: 8840: 8815:"Automatic generation of efficient lexical processors using finite state techniques" 8701: 8549: 8483: 8347:. Sometimes it is useful to specify an alternate regex delimiter in order to avoid " 8025: 7180: 7143: 6610: 5901:
there are TWO non-whitespace characters, which may be separated by other characters.
Thus, possessive quantifiers are most useful with negated character classes, e.g.
In Python and some other implementations (e.g. Java), the three common quantifiers (
9859: 9829: 9783: 9585: 9521: 9440: 9393: 9360: 9206: 9029: 8889: 8881: 8878:[1991] Proceedings Sixth Annual IEEE Symposium on Logic in Computer Science 8826: 8749: 8537: 8296: 7915: 7903: 7719: 7650: 7603: 7566: 6708: 4316:
sequences, before reordering them into canonical order (and optionally recomposing
characters as their token set though regex libraries have supported numerous other
in other regex flavors which support them. With most other regex flavors, the term
Matches a single character that is not contained within the brackets. For example,
Most formalisms provide the following operations to construct regular expressions.
in use. Additionally, the functionality of regex implementations can vary between
Regexes are useful in a wide variety of text processing tasks, and more generally
POSIX character classes can only be used within bracket expressions. For example,
matches "hat", "cat", "hhat", "chat", "hcat", "cchchat", and so on, but not "at".
there are TWO whitespace characters, which may be separated by other characters.
and pattern matching. For this reason, some people have taken to using the term
character can be included in a bracket expression if it is the first (after the
446:), advanced text editors, and some other programs. Regex support is part of the 9765: 9658: 9414: 9196: 9178: 8962: 8479: 6960: 5159:"'l' followed by 'o' (e.g., eo, elo, ello, elllo).\n" 4545: 4014: 3984: 3580: 1554: 1258: 1080: 157: 8219: 7654: 4320:
combining characters into the leading base character) is called normalization.
Handbook of Theoretical Computer Science, volume A: Algorithms and Complexity
There exists a substring with at least 1 and at most 2 l's in Hello World
4537: 4192: 3576: 1996: 1167: 1102: 1002:
matches any string that contains an "a", and then any character and then "b".
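The difference between the two "a ... b" descriptions in this section can be sketched in Python with the standard `re` module. The pattern text itself did not survive here, so `a.b` and `a.*b` are assumed to be the patterns being described, and the helper name `contains` is purely illustrative.

```python
import re

# Assumed patterns: the pattern text is missing from the descriptions.
DOT = re.compile(r"a.b")        # "a", then exactly one character, then "b"
DOT_STAR = re.compile(r"a.*b")  # "a", then "b" at some later point


def contains(pattern, text):
    """Return True if the compiled pattern matches anywhere in text."""
    return pattern.search(text) is not None


for sample in ["axb", "ab", "a--b", "ba"]:
    # Print the sample with the result for each pattern.
    print(sample, contains(DOT, sample), contains(DOT_STAR, sample))
```

Note that in Python, as in many engines, `.` does not match a newline unless a flag such as `re.DOTALL` is given.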
816: 498: 378: 346: 153: 8607: 8243: 6989: 5077:"more characters is 'llo' rather than 'llo Wo'.\n" 4915:
There are one or more consecutive letter "l"'s in Hello World.
For example, in ASCII-based implementations, character ranges of the form
Many variations of these original forms of regular expressions were used in
for writing regular expressions have existed since the 1980s, one being the
9778: 9595: 9013: 7944: 7799:"Jumbo Regexp Patch Applied (with Minor Fix-Up Tweaks): Perl/perl5@c277df4" 6907: 4632:
Also worth noting is that these regexes are all Perl-like syntax. Standard
4541: 4021: 2455: 1688: 1174:
denotes {ε, "ab", "c", "abab", "abc", "cab", "cc", "ababab", "abcab", ...}.
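The set listed above is the Kleene star of {"ab", "c"}. As a rough sketch, and not a definitive implementation, its members up to a chosen length can be enumerated in Python. The function name `kleene_star` and the `max_len` bound are illustrative assumptions, needed only because the full set is infinite.

```python
def kleene_star(base, max_len):
    """Return every string of length <= max_len in the Kleene star of `base`."""
    result = {""}       # the empty string (epsilon) is always a member
    frontier = {""}     # strings found in the previous round
    while frontier:
        next_frontier = set()
        for prefix in frontier:
            for word in base:
                candidate = prefix + word
                if len(candidate) <= max_len and candidate not in result:
                    result.add(candidate)
                    next_frontier.add(candidate)
        frontier = next_frontier
    return result


R = {"ab", "c"}
# List the members by length, then alphabetically.
print(sorted(kleene_star(R, 4), key=lambda s: (len(s), s)))
```

Each round appends one more word from the base set, which mirrors the definition of the star as the closure of the set under concatenation.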
1071: 745: 717: 424: 252: 56: highlights show the match results of the regular expression pattern: 9034: 9017: 8831: 8814: 8541: 8074: 6968: 5734:
which in ASCII are tab, line feed, form feed, carriage return, and space;
and patterns can be joined with a comma to specify a range of lines as in
The specific syntax rules vary depending on the specific implementation,
matches "at", "hat", "cat", "hhat", "chat", "hcat", "cchchat", and so on.
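The "at"/"hat"/"cat" examples in this section differ only in the quantifier applied to a bracket expression. A small Python sketch, assuming the patterns are `[hc]?at`, `[hc]*at` and `[hc]+at` (the pattern text is missing here, so these are assumptions), shows which words each one accepts in full. The helper name `full` is illustrative.

```python
import re


def full(pattern, text):
    """Return True if the whole of text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


words = ["at", "hat", "cat", "hhat", "cchchat"]
for pattern in (r"[hc]?at", r"[hc]*at", r"[hc]+at"):
    # "?" allows zero or one, "*" zero or more, "+" one or more of [hc].
    print(pattern, [w for w in words if full(pattern, w)])
# [hc]?at ['at', 'hat', 'cat']
# [hc]*at ['at', 'hat', 'cat', 'hhat', 'cchchat']
# [hc]+at ['hat', 'cat', 'hhat', 'cchchat']
```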
matches "hat" and "cat", but only at the beginning of the string or line.
The third algorithm is to match the pattern against the input string by
matching as few characters as possible, by appending a question mark:
by default because they match as many characters as possible. The regex
are equivalent patterns which both describe the set of "gray" or "grey".
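The equivalence stated above can be checked with a short Python sketch. The two patterns are assumed to be `gray|grey` and `gr(a|e)y`, since the pattern text is missing from the sentence, and the candidate list and the helper name `accepted` are illustrative only.

```python
import re

# Assumed patterns: alternation of whole words versus alternation inside a group.
PATTERNS = [r"gray|grey", r"gr(a|e)y"]
CANDIDATES = ["gray", "grey", "gry", "graey", "groy"]


def accepted(pattern):
    """Return the set of candidate words that the pattern matches in full."""
    return {w for w in CANDIDATES if re.fullmatch(pattern, w)}


for pattern in PATTERNS:
    print(pattern, sorted(accepted(pattern)))
# gray|grey ['gray', 'grey']
# gr(a|e)y ['gray', 'grey']
```

A finite candidate list cannot prove equivalence in general, but here both patterns describe a two-element language, so the check is exhaustive for the strings they accept.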
9051: 8944: 8462: 7767: 7043: 6764: 5871:"In $ string1 there are TWO non-whitespace characters, which" 5782:"In $ string1 there are TWO whitespace characters, which may" 4569: 4476: 4246: 3956:
are attested since 1997 in a commit by Ilya Zakharevich to Perl 5.005.
Succinctness of the Complement and Intersection of Regular Expressions
8352: 7768:"How to simulate lookaheads and lookbehinds in finite state automata?" 7047: 6955: 6838: 6836: 650:
representing the text being searched in. One possible approach is the
8893: 8748:. Lecture Notes in Computer Science. Vol. 5126. pp. 39–50. 8622: 8344: 7608: 5068:"The non-greedy match with 'l' followed by one or " 4521: 3965: 3673:"Ganymede," he continued, "is the largest moon in the Solar System." 3627:"Ganymede," he continued, "is the largest moon in the Solar System." 2157:). A marked subexpression is also called a block or capturing group. 1799: 1603: 1129: 834: 688:
denotes a simpler regular expression in turn, which has already been
400:(precursored by ANSI "GCA 101-1983") consolidated. The kernel of the 307: 260: 8171: 7907: 7309:. Upper Saddle River, New Jersey: Addison Wesley. pp. 117–120. 7305:
Hopcroft, John E.; Motwani, Rajeev & Ullman, Jeffrey D. (2003).
7114: 6603:"Regular Expression Tutorial - Learn How to Use Regular Expressions" 2482:
could mean any digit. Character classes apply to both POSIX levels.
99 is the first number in '99 bottles of beer on the wall.'
4970:"There is an 'H' and a 'e' separated by " 4294: 4290: 2310:
matches "hat" and "cat", but only at the end of the string or line.
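The two anchored descriptions in this section can be illustrated in Python, assuming the patterns are `^[hc]at` and `[hc]at$` (the pattern text is missing here). The sample text is invented for the illustration. In Python's `re` module the anchors apply to each line only when `re.MULTILINE` is set; otherwise they apply to the string as a whole.

```python
import re

# Hypothetical three-line sample text.
TEXT = "hat trick\nthe cat\ncat"

# With re.MULTILINE, ^ and $ match at the start and end of every line.
line_starts = re.findall(r"^[hc]at", TEXT, flags=re.MULTILINE)
line_ends = re.findall(r"[hc]at$", TEXT, flags=re.MULTILINE)

# Without the flag, ^ and $ match only at the start and end of the string.
string_starts = re.findall(r"^[hc]at", TEXT)
string_ends = re.findall(r"[hc]at$", TEXT)

print(line_starts, line_ends)      # ['hat', 'cat'] ['cat', 'cat']
print(string_starts, string_ends)  # ['hat'] ['cat']
```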
1774:. The usual characters that become metacharacters when escaped are 823: 512:(match all lower case letters from 'a' to 'z') is less general and 296: 268: 8738:
Finite Automata, Digraph Connectivity, and Regular Expression Size
8696: 8195: 8026:"UTS#18 on Unicode Regular Expressions, Annex A: Character Blocks" 7872: 5412:"$ string1 contains at least one of Hello, Hi, or Pogo." 4639:
Unless otherwise indicated, the following examples conform to the
4532:, where the data need not be textual. Common applications include 3786:. The general problem of matching any number of backreferences is 2473:
could stand for any uppercase letter in the English alphabet, and
462:. In the late 2010s, several companies started to offer hardware, 227:(models of computation) and the description and classification of 9824: 8924: 8267: 7636: 4588: 4549: 4196: 2048: 335: 263:. For speed, Thompson implemented regular expression matching by 6142:"$ string1 starts with the characters 'He'.\n" 4766:
When you match a pattern within parentheses, you can use any of
1632:) denote the same regular language, for all regular expressions 9097: 9082: 8804:
Hopcroft, John E.; Motwani, Rajeev; Ullman, Jeffrey D. (2000).
7949: 7680: 4215: 4211: 2198:
Matches the preceding element zero or more times. For example,
Every regular expression can be written solely in terms of the
1150: 1058:Σ, the following constants are defined as regular expressions: 1018: 244: 31: 6261:
Matches the beginning of a string (but not an internal line).
5026:'d regex that comes before to match as few times as possible. 2399:
Matches the preceding element one or more times. For example,
also specifies the same set of three strings in this example.
9429: 9018:"Programming Techniques: Regular expression search algorithm" 5150:"There is an 'e' followed by zero to many " 4656: 4625:
operation in Perl =~ s///  ;; indicates a regex
4591:. However, Google Code Search was shut down in January 2012. 4481: 4461: 4207: 4065: 2385:
Matches the preceding element zero or one time. For example,
1988: 1878: 1850:. This notation is particularly well known due to its use in 1835: 518: 458:, and is built into the syntax of others, including Perl and 443: 331: 142: 9498:
Each category of languages, except those marked by a , is a
7385:, ISO/IEC 9945-2:2003, and currently ISO/IEC/IEEE 9945:2009 5975:"$ 1 is the first number in '$ string1'\n" 4195:. Many modern regex engines offer at least some support for 3968:
that decide whether and how a given regex matches a string.
in formal language theory. The pattern for these strings is
matches "", "x", "y", "z", "zx", "zyx", "xyzzy", and so on.
Introduction to Automata Theory, Languages, and Computation
Introduction to Automata Theory, Languages, and Computation
There is at least one alphanumeric character in Hello World
5494:"There is a word that ends with 'llo'.\n" 4640: 4486: 4337: 4286: 3907: 3552: 2366:
backreferences and the following metacharacters are added:
1953: 1851: 1831: 1827: 1565: 471: 463: 345:, which originally derived from a regex library written by 342: 323: 303: 280: 207:
Regular expressions originated in 1951, when mathematician
146: 134: 46: 9513: 8586:(6). The Open Group. 2004. IEEE Std 1003.1, 2004 Edition. 7203: 7201: 5108:
Matches the preceding pattern element zero or more times.
matches any character in the Armenian script. In general,
An additional non-POSIX class understood by some tools is
8484:"Chapter 10. Patterns, Automata, and Regular Expressions" 8465:(1990). "Algorithms for finding patterns in strings". In 8388: 7346:"On defining relations for the algebra of regular events" 6436:
Matches every character except the ones inside brackets.
Matches the preceding pattern element one or more times.
4824:"We matched '$ 1' and '$ 2'.\n" 4764:
Groups a series of pattern elements to a single element.
4491: 4402:
matches any uppercase letter. Binary properties that are
matches the uppercase letters and lowercase "a" and "b".
matches "abc", "abbc", "abbbc", and so on, but not "ac".
$(a \mid b)^{*}\, a\, (a \mid b)(a \mid b)(a \mid b)$
matches "abc", "abbc", "abbbc", and so on, but not "ac".
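The three basic quantifiers can be compared side by side in Python. The patterns `ab?c`, `ab*c`, `ab+c` and `colou?r` are assumed to be the ones the surrounding examples describe, since the pattern text is missing here, and the helper name `matches` is illustrative.

```python
import re


def matches(pattern, words):
    """Return the words that the pattern matches in full."""
    return [w for w in words if re.fullmatch(pattern, w)]


WORDS = ["ac", "abc", "abbc", "abbbc"]

print(matches(r"ab?c", WORDS))  # zero or one "b":  ['ac', 'abc']
print(matches(r"ab*c", WORDS))  # zero or more "b": ['ac', 'abc', 'abbc', 'abbbc']
print(matches(r"ab+c", WORDS))  # one or more "b":  ['abc', 'abbc', 'abbbc']
print(matches(r"colou?r", ["color", "colour", "colouur"]))  # ['color', 'colour']
```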
467: 432: 414: 404:
standards consists of regexes. Its use is evident in the
350: 319: 315: 196: 173: 169: 7643:
International Journal of Foundations of Computer Science
The Open Group Base Specifications Issue 7, 2018 edition
Matches the end of a string (but not an internal line).
6057:"There is at least one character in $ string1" 5598:"character in $ string1 (A-Z, a-z, 0-9, _).\n" 4928:
Matches the preceding pattern element zero or one time.
Matches any single character (many applications exclude
1170:(including the empty string). If R denotes {"ab", "c"}, 8812: 7637:
Cezar Câmpeanu; Kai Salomaa & Sheng Yu (Dec 2003).
SRE: Atomic Grouping (?>...) is not supported #34627
7198: 6842: 3634:. The aforementioned quantifiers may, however, be made 1920:, and it removes the need to escape the metacharacters 1818:
is the editor command for searching, and an expression
Liger, François; McQueen, Craig; Wilton, Paul (2002).
The character 'm' is not always required to specify a
7115:"GRegex – Faster Analytics for Unstructured Text Data" 6823: 6821: 6478:"$ string1 contains a character other than " 5719:
The space between Hello and World is not alphanumeric.
matches any character with either the binary property
Sublinear runtime algorithms have been achieved using
8803: 8645: 8627:
Sams Teach Yourself Regular Expressions in 10 Minutes
7304: 7207: 5255:"There exists a substring with at least 1 " 4524:
which uses regular expressions to identify bad titles
compatible regex engines that are faster compared to
Regular Expression, IEEE Std 1003.1-2017, Open Group
8604:"Regular Expression Matching Can Be Simple and Fast" 7889:"NR-grep: a fast and flexible pattern-matching tool" 7599:
I-Regexp: An Interoperable Regular Expression Format
6937: 5880:" may be separated by other characters.\n" 5336:"$ string1 contains one or more vowels.\n" 5190:
Denotes the minimum M and the maximum N match count.
The following conventions are used in the examples.
3959: 3624:(including the double-quotes) applied to the string 1088:
in Σ denoting the set containing only the character
In the 1980s, the more complicated regexes arose in
8403: 6934:, 10.11 Bibliographic Notes for Chapter 10, p. 589. 6818: 6580:– converts a regular expression into an equivalent 5542:property contains more than Latin letters, and the 4447:support regex capabilities, either natively or via 4289:may be desired. In Japanese, insensitivity between 2330: 2127:matches any character other than "a", "b", or "c". 1311:. On the one hand, a regular expression describing 1228: 904:occurrences of the preceding element. For example, 884:occurrences of the preceding element. For example, 864:occurrences of the preceding element. For example, 769:are used to define the scope and precedence of the 41:
Pointer (computer science) § Pointer-to-member
8966: 8938: 5525:Matches an alphanumeric character, including "_"; 4366:matches code points not in that block. Similarly, 4168: 4115: 3770:The language of squares is not regular, nor is it 3662:In Java and Python 3.11+, quantifiers may be made 1830:("global regex print"), which is included in most 1798:. However, they are often written with slashes as 1531: 1397: 279:, which eventually led to the popular search tool 231:, motivated by Kleene's attempt to describe early 27:Sequence of characters that forms a search pattern 8715: 7639:"A Formal Study of Practical Regular Expressions" 6752: 6248:is a line or string that ends with 'rld'. 4686:Normally matches any character except a newline. 2076:matches "a", "b", "c", "x", "y", or "z", as does 888:matches "ac", "abc", "abbc", "abbbc", and so on. 114:. Regular expression techniques are developed in 9877: 8075:"Regular expressions library - cppreference.com" 6634:The Oxford Handbook of Computational Linguistics 5791:" be separated by other characters.\n" 5264:"and at most 2 l's in $ string1\n" 4979:"0-1 characters (e.g., He Hue Hee).\n" 1655:) denote the same language over the alphabet Σ={ 30:"Regex" redirects here. For the comic book, see 8099:"Regular Expression Language - Quick Reference" 7708: 7271:"Regular expressions for deciding divisibility" 5589:"There is at least one alphanumeric " 3688:consumes the entire input, including the final 2485:When specifying a range of characters, such as 2210:matches "", "ab", "abab", "ababab", and so on. 1838:distributions. A similar convention is used in 542:The usual context of wildcard characters is in 106:for "find" or "find and replace" operations on 39:".*" redirects here. 
For the C++ operator, see 7381:, successively revised as ISO/IEC 9945-2:2002 7074: 7072: 6904:"An incomplete history of the QED Text Editor" 6084:There is at least one character in Hello World 4232:Extending ASCII-oriented constructs to Unicode 3521:Note that what the POSIX regex standards call 1269:matches all strings over Σ* that do not match 349:(1986), who later wrote an implementation for 9529: 9155: 8857:. Princeton University Press. pp. 3–42. 7709:Ritchie, D. M.; Thompson, K. L. (June 1970). 7681:"Perl Regular Expression Matching is NP-Hard" 5512:There is a word that ends with 'llo'. 5294:Denotes a set of possible character matches. 4287:initial, medial, final, and isolated position 2089:, if present) character within the brackets: 1723:as metacharacters. Metacharacters help form: 1295:consisting of all strings over the alphabet { 235:. (Kleene introduced it as an alternative to 145:standard and another, widely used, being the 8941:Visual Basic .NET Text Manipulation Handbook 8787:Real World Regular Expressions with Java 1.4 8734: 7241: 7091:"PCRE - Perl Compatible Regular Expressions" 6990:"New Regular Expression Features in Tcl 8.1" 6508:contains a character other than a, b, and c. 6400:"that ends with 'd\\n'.\n" 5433:contains at least one of Hello, Hi, or Pogo. 1731:telling how many atoms (and whether it is a 9477:Counter-free (with aperiodic finite monoid) 8993: 8679: 7796: 7765: 7253: 7229: 7069: 7042: 6312:"that starts with 'H'.\n" 6227:"that ends with 'rld'.\n" 6100:Matches the beginning of a line or string. 5945:"99 bottles of beer on the wall." 4842:We matched 'Hel' and 'o W'. 4730:"$ string1 has length >= 5.\n" 4688:Within square brackets the dot is literal. 3989:(2), but it can be run on a string of size 3161:Visible characters and the space character 1982: 1715:. The pattern is composed of a sequence of 1595:Deciding equivalence of regular expressions 69:followed by one or more lower-case vowels). 
9536: 9522: 9162: 9148: 8325: 8020: 8018: 6218:"$ string1 is a line or string " 5738:break spaces, next line, and the variable- 4579:While regexes would be useful on Internet 1608:minimal deterministic finite state machine 535:matches both "serialise" and "serialize". 9033: 8973:Introduction to the Theory of Computation 8918: 8830: 8716:Goyvaerts, Jan; Levithan, Steven (2009). 8695: 8531: 8477: 8400:All the if statements return a TRUE value 7981: 7871: 7826: 7607: 7408:"9.3.6 BREs Matching Multiple Characters" 7397:The Single Unix Specification (Version 2) 6931: 6854: 6662: 6624: 6424:is a string that ends with 'd\n'. 6336:is a string that starts with 'H'. 5546:property contains more than Arab digits. 3596:originally developed in PCRE and Python. 2098:. Backslash escapes are not allowed. The 1528: 1041:. They have the same expressive power as 1028: 450:of many programming languages, including 9708:Comparison of regular-expression engines 9012: 8735:Gruber, Hermann; Holzer, Markus (2008). 8560:The Single UNIX Specification, Version 2 8295: 7742: 7740: 7527: 7525: 7479: 7477: 7432: 6827: 6637:. Oxford University Press. p. 754. 6558:Comparison of regular expression engines 6163:starts with the characters 'He'. 5701:"World is not alphanumeric.\n" 5692:"The space between Hello and " 5636:-alphanumeric character, excluding "_"; 4512: 4451:. Comprehensive support is included in: 4422:. Examples of non-binary properties are 4169:{\displaystyle {\mathrm {O} }(n^{2k+1})} 4116:{\displaystyle {\mathrm {O} }(n^{2k+2})} 617: 255:built Kleene's notation into the editor 195: 45: 8015: 7886: 7595: 7456:"Perl Regular Expression Documentation" 7293: 7174: 7172: 7170: 7168: 6773:"Regular Languages and Finite Automata" 6582:nondeterministic finite automaton (NFA) 5929:property, which itself the same as the 4072:backreference note has a complexity of 2359:. 
Additionally, support is removed for 2230:Matches the preceding element at least 2183:th marked subexpression matched, where 1881:standard has three sets of compliance: 1842:, where search and replace is given by 1206:denotes {ε, "a", "b", "bb", "bbb", ...} 968:The preceding item is matched at least 215:using his mathematical notation called 14: 9878: 9369:Linear context-free rewriting language 8961: 8847: 8784: 8643: 8510: 8409: 7975: 7957:from the original on 14 September 2020 7865: 7462:from the original on December 31, 2009 7218: 7015:"Documentation: 9.3: Pattern Matching" 6943: 6758: 6668: 6630: 6532:and is part of the general problem of 4224:. Many regex engines support only the 3657: 1869: 920:The preceding item is matched exactly 9669:Zhu–Takaoka string matching algorithm 9517: 9294:Linear context-free rewriting systems 9143: 8875: 8680:Gelade, Wouter; Neven, Frank (2008). 8621: 7887:Navarro, Gonzalo (10 November 2001). 7737: 7522: 7474: 7343: 7332: 7208:Hopcroft, Motwani & Ullman (2000) 6770: 6709:"How a Regex Engine Works Internally" 6689:from the original on 27 February 2017 6669:Lawson, Mark V. (17 September 2003). 6600: 6176:Matches the end of a line or string. 4445:general-purpose programming languages 4326:. Unicode introduced amongst others, 3904:Look-behind and look-ahead assertions 3533:is used to describe what POSIX calls 1893:(Simple Regular Expressions). SRE is 1691:axiomatized regular expressions as a 1115:denotes {"abd", "abef", "cd", "cef"}. 748:separates alternatives. For example, 708:A regular expression, often called a 677:obtained from the regular expression 137:text-processing utilities. Different 9050: 8415:"Regular Expressions, End of String" 8313:from the original on 21 October 2018 8244:"re – Regular expression operations" 7746: 7165: 7078: 6546:language identification in the limit 4210:encoding, while others might expect 4050:Regular expression Denial of Service 4017:-sregex family based on Cox's code. 
2465: 1908:BRE and ERE work together. ERE adds 1901:. The subsection below covering the 1889:(Extended Regular Expressions), and 1241:—these can be expressed as follows: 1048: 996:matches any character. For example, 952:The preceding item is matched up to 102:. Usually such patterns are used by 9634:Boyer–Moore string-search algorithm 8996:Regular Expression Pocket Reference 8601: 8461: 8385:Scripting for Computational Science 7925:from the original on 7 October 2020 7815: 7778:from the original on 7 October 2020 7602:. Internet Engineering Task Force. 7577:from the original on 7 October 2020 7547:from the original on 7 October 2020 7178: 6901: 6066:" that is not a digit.\n" 5370:Separates alternate possibilities. 4636:regular expressions are different. 4438: 3964:There are at least three different 3676:matches the entire line, the regex 3587:. Some languages and tools such as 3547:Perl Compatible Regular Expressions 2202:matches "ac", "abc", "abbbc", etc. 2014:, whereas Extended Regular Syntax ( 1972: 1897:, in favor of BRE, as both provide 868:matches both "color" and "colour". 156:, in search and replace dialogs of 24: 9502:of the category directly above it. 8584:The Open Group Base Specifications 8473:. The MIT Press. pp. 255–300. 8148:"Regular expressions - JavaScript" 7350:Ukrainskii Matematicheskii Zhurnal 6391:"$ string1 is a string " 6303:"$ string1 is a string " 4136: 4083: 3960:Implementations and running times 3743:Patterns for non-regular languages 1968: 1834:-based operating systems, such as 1732: 1610:, and determines whether they are 25: 9922: 9723:Nondeterministic finite automaton 9664:Two-way string-matching algorithm 9075: 8513:"A brief history of just-in-time" 7896:Software: Practice and Experience 6771:Leung, Hing (16 September 2010). 5192:N can be omitted and M can be 0: 4061:Boyer-Moore (BM) based algorithms 3973:nondeterministic finite automaton 3700:when applied to the same string. 
2608:Alphanumeric characters plus "_" 1573:Thompson's construction algorithm 1136:describes {"ab", "c", "d", "ef"}. 773:(among other uses). For example, 703: 656:nondeterministic finite automaton 652:Thompson's construction algorithm 413:syntax for filenames, and in the 133:. They came into common use with 9081: 9056:"Apocalypse 5: Pattern Matching" 8580:"Chapter 9: Regular Expressions" 7797:Zakharevich, Ilya (1997-11-19). 6734:"How Do You Actually Use Regex?" 5732:Matches a whitespace character, 4346:library, properties of the form 3983:has the time and memory cost of 3829: 3733: 3599: 3540: 2331:Metacharacters in POSIX extended 1991:standard, Basic Regular Syntax ( 1551:nondeterministic finite automata 1307:th-from-last letter equals  1229:Expressive power and compactness 402:structure specification language 326:, and in other programs such as 259:as a means to match patterns in 152:Regular expressions are used in 9062:from the original on 2010-01-12 8864:from the original on 2020-10-07 8808:(2nd ed.). Addison-Wesley. 
8773:from the original on 2011-07-11 8668:from the original on 2005-08-30 8590:from the original on 2011-12-02 8567:from the original on 2020-10-07 8499:from the original on 2020-10-07 8492:Foundations of Computer Science 8439:from the original on 2020-10-07 8394: 8365: 8289: 8278:from the original on 2022-11-29 8260: 8236: 8212: 8188: 8164: 8140: 8124:"Pattern (Java Platform SE 7 )" 8116: 8091: 8067: 8043: 8032:from the original on 2020-10-07 8004:from the original on 2020-10-07 7990: 7969: 7937: 7880: 7859: 7831: 7820: 7809: 7790: 7772:Computer Science Stack Exchange 7759: 7702: 7691:from the original on 2020-10-07 7673: 7661:from the original on 2015-07-04 7630: 7589: 7559: 7511: 7448: 7426: 7400: 7391: 7371: 7360:from the original on 2018-03-29 7337: 7326: 7298: 7287: 7263: 7246: 7235: 7223: 7212: 7154:from the original on 2020-10-07 7136: 7125:from the original on 2020-10-07 7107: 7083: 7058:from the original on 2009-12-31 7036: 7025:from the original on 2020-10-07 7007: 6996:from the original on 2020-10-07 6982: 6949: 6895: 6884:from the original on 2020-10-07 6860:"A Regular Expressions Matcher" 6651:from the original on 2017-02-28 5740:width spaces (amongst others). 4556:systems, and many other tasks. 3975:(NFA) to be transformed into a 2432:matches "at", "hat", and "cat". 2294:matches all strings matched by 2284:matches all strings matched by 2058:matches only "a", ".", or "c". 1663:}. More generally, an equation 9639:Boyer–Moore–Horspool algorithm 9629:Apostolico–Giancarlo algorithm 8968:"Chapter 1: Regular Languages" 8644:Friedl, Jeffrey E. F. (2002). 8335:match operation. For example, 8051:"regex(3) - Linux manual page" 7718:. MM-70-1373-3. Archived from 7258:File:RegexComplementBlowup.png 6806: 6726: 6701: 6675:. CRC Press. pp. 98–100. 
6594: 6530:induction of regular languages 6524:Induction of regular languages 4163: 4141: 4110: 4088: 3977:deterministic finite automaton 3449:, which is usually defined as 2335:The meaning of metacharacters 1948:" for BRE (the default), and " 1866:, using commas as delimiters. 1750:, in this case, the backslash 1499: 1487: 1481: 1469: 1466: 1454: 1438: 1425: 1392: 1380: 1377: 1365: 1362: 1350: 1338: 1325: 1263:generalized regular expression 1166:denotes the set of all finite 936:The preceding item is matched 664:deterministic finite automaton 357:. The Tcl library is a hybrid 273:Compatible Time-Sharing System 13: 1: 8647:Mastering Regular Expressions 8454: 7596:Bormann, Carsten; Bray, Tim. 7443:Digression: POSIX Submatching 7181:"grep(1) - Linux manual page" 6538:computational learning theory 4275:Cousins of case insensitivity 3825: 1905:applies to both BRE and ERE. 1885:(Basic Regular Expressions), 1785: 1408:Generalizing this pattern to 1282:deterministic finite automata 1261:operator is added, to give a 1033:Regular expressions describe 9644:Knuth–Morris–Pratt algorithm 9571:Damerau–Levenshtein distance 8718:Regular Expressions Cookbook 7998:"Vim documentation: pattern" 6517: 5736:in Unicode, also matches no- 5357:contains one or more vowels. 5200:matches "at least" M times; 4424:\p{Bidi_Class=Right_to_Left} 4249:in the range and codepoint( 3525:are commonly referred to as 3464:classes (using the notation 2389:matches only "ac" or "abc". 1975:, named capture groups, and 1940:has the following options: " 1571:Given a regular expression, 860:The question mark indicates 811:after an element (such as a 355:Advanced Regular Expressions 221:theoretical computer science 203:, who introduced the concept 129:formalized the concept of a 116:theoretical computer science 86:), sometimes referred to as 7: 9835:Compressed pattern matching 9561:Approximate string matching 9543: 8975:. PWS Publishing. pp.  
For these, 4280: 4276: 4273: 4270: 4267: 4264: 4260: 4256: 4252: 4248: 4244: 4240: 4233: 4230: 4227: 4223: 4220: 4217: 4213: 4209: 4205: 4202: 4201: 4200: 4198: 4194: 4190: 4180: 4158: 4155: 4152: 4149: 4145: 4105: 4102: 4099: 4096: 4092: 4069: 4067: 4062: 4057: 4053: 4051: 4023: 4018: 4016: 4012: 4008: 4002: 4000: 3996: 3992: 3988: 3987: 3982: 3978: 3974: 3969: 3967: 3957: 3925: 3909: 3902: 3893: 3887: 3880: 3875: 3872: 3871: 3863: 3858: 3851: 3846: 3843: 3842: 3838: 3835: 3832: 3831: 3822: 3818: 3816: 3812: 3808: 3804: 3800: 3796: 3791: 3789: 3785: 3781: 3777: 3776:pumping lemma 3774:, due to the 3773: 3768: 3762: 3758: 3750: 3740: 3734:IETF I-Regexp 3731: 3728: 3723:only matches 3711:matches both 3701: 3681: 3671: 3665: 3655: 3650:matches only 3645: 3641: 3637: 3625: 3619: 3600:Lazy matching 3597: 3594: 3590: 3586: 3582: 3578: 3574: 3570: 3566: 3562: 3558: 3554: 3548: 3541:Perl and PCRE 3538: 3536: 3532: 3528: 3524: 3519: 3463: 3459: 3455: 3444: 3426: 3407: 3397: 3395: 3391: 3388: 3387: 3382: 3363: 3353: 3351: 3347: 3344: 3343: 3338: 3328: 3318: 3308: 3306: 3303: 3302: 3297: 3269: 3259: 3249: 3245: 3243: 3240: 3239: 3234: 3215: 3213: 3211: 3207: 3204: 3203: 3198: 3179: 3169: 3167: 3163: 3160: 3159: 3154: 3135: 3125: 3123: 3119: 3116: 3115: 3110: 3091: 3089: 3087: 3083: 3080: 3079: 3074: 3064: 3054: 3044: 3042: 3039: 3038: 3033: 3005: 2995: 2985: 2981: 2978: 2977: 2972: 2953: 2951: 2949: 2945: 2943: 2940: 2939: 2883: 2873: 2871: 2869: 2867: 2864: 2863: 2807: 2797: 2792: 2782: 2780: 2777: 2776: 2771: 2752: 2742: 2740: 2736: 2733: 2732: 2727: 2708: 2698: 2696: 2692: 2689: 2688: 2683: 2673: 2663: 2653: 2651: 2648: 2647: 2642: 2632: 2622: 2612: 2610: 2607: 2606: 2601: 2582: 2580: 2578: 2574: 2571: 2570: 2565: 2546: 2544: 2542: 2540: 2537: 2536: 2532: 2529: 2526: 2523: 2520: 2517: 2516: 2513: 2511: 2507: 2503: 2496:to uppercase 2495: 2483: 2463: 2457: 2446: 2440: 2434: 2428: 2427: 2426: 2425: 2412: 2407: 2406: 2398: 2393: 2392: 2384: 2379: 2374: 2371: 2370: 2367: 
2364: 2342: 2338: 2328: 2318: 2312: 2306: 2300: 2290: 2288:except "bat". 2280: 2274: 2268: 2267: 2266: 2265: 2252: 2248: 2237: 2233: 2229: 2224: 2220: 2214: 2197: 2192: 2186: 2182: 2178: 2175: 2170: 2166: 2155: 2149: 2144: 2139: 2134: 2123: 2119: 2115: 2066: 2062: 2050: 2046: 2041: 2036: 2031: 2026: 2023: 2022: 2019: 2017: 1998: 1994: 1990: 1980: 1978: 1974: 1970: 1969:lazy matching 1957: 1955: 1936: 1931: 1906: 1904: 1900: 1896: 1892: 1888: 1884: 1880: 1877: 1867: 1853: 1841: 1837: 1833: 1829: 1813: 1801: 1783: 1757: 1749: 1745: 1740: 1737: 1735: 1730: 1726: 1718: 1714: 1710: 1700: 1698: 1694: 1690: 1686: 1681: 1677: 1672: 1670: 1666: 1662: 1658: 1654: 1651: 1647: 1643: 1639: 1635: 1631: 1628: 1624: 1620: 1615: 1613: 1609: 1605: 1600: 1592: 1590: 1586: 1580: 1578: 1574: 1569: 1567: 1562: 1560: 1556: 1552: 1548: 1525: 1515: 1512: 1509: 1503: 1496: 1493: 1490: 1484: 1478: 1475: 1472: 1463: 1460: 1457: 1447: 1442: 1434: 1431: 1428: 1418: 1417: 1416: 1414: 1406: 1389: 1386: 1383: 1374: 1371: 1368: 1359: 1356: 1353: 1347: 1342: 1334: 1331: 1328: 1314: 1310: 1306: 1302: 1298: 1294: 1287: 1286:exponentially 1283: 1278: 1276: 1272: 1268: 1264: 1260: 1220: 1214: 1208: 1202: 1201: 1200: 1199: 1195: 1169: 1160: 1156: 1152: 1144: 1143: 1138: 1131: 1123: 1122: 1117: 1105: 1104: 1103:concatenation 1099: 1098: 1097: 1091: 1083: 1082: 1077: 1074: 1073: 1068: 1065: 1061: 1060: 1059: 1057: 1046: 1044: 1040: 1036: 1026: 1024: 1023:§ Syntax 1020: 1015: 1004: 998: 997: 994: 990:The wildcard 989: 987: 984: 983: 975: 971: 967: 964: 960: 959: 955: 951: 948: 944: 943: 939: 935: 932: 928: 927: 923: 919: 916: 912: 911: 903: 899: 896: 892: 891: 883: 879: 876: 872: 871: 863: 859: 856: 852: 851: 848: 847: 843: 836: 832: 825: 818: 817:question mark 814: 810: 806: 803: 772: 768: 765: 762: 747: 743: 741: 738: 737: 736: 733: 727: 719: 715: 711: 701: 699: 695: 691: 687: 681: 674: 670: 665: 661: 657: 653: 649: 645: 637: 633: 628: 624: 620: 616: 545: 540: 538: 530: 525: 523: 520: 500: 
499:metacharacter 496: 492: 482: 481: 477: 473: 469: 465: 461: 457: 453: 449: 445: 440: 438: 434: 430: 426: 421: 416: 412: 407: 403: 399: 394: 392: 388: 384: 380: 379:mini-language 376: 372: 368: 364: 360: 356: 352: 348: 347:Henry Spencer 344: 339: 337: 333: 329: 325: 321: 317: 313: 309: 305: 300: 298: 294: 288: 282: 278: 274: 270: 266: 262: 258: 254: 248: 246: 242: 238: 234: 230: 226: 222: 218: 214: 210: 202: 198: 189: 187: 186:many of these 183: 179: 175: 171: 167: 163: 159: 155: 150: 148: 144: 140: 136: 132: 128: 123: 121: 117: 113: 109: 105: 101: 97: 96:match pattern 93: 89: 85: 81: 77: 68: 48: 42: 35: 34: 19: 9779:Suffix array 9699: 9685:Aho–Corasick 9596:Lee distance 9436:Nested stack 9379:Context-free 9304:Context-free 9265:Unrestricted 9128: 9120: 9112: 9104: 9064:. Retrieved 9025: 9021: 8998:. O'Reilly. 8995: 8972: 8940: 8929:. Retrieved 8925:the original 8877: 8866:. Retrieved 8850: 8822: 8818: 8805: 8789:. Springer. 8786: 8775:. Retrieved 8745: 8737: 8717: 8706:. Retrieved 8702:the original 8687: 8682: 8670:. Retrieved 8646: 8626: 8612:. Retrieved 8608:the original 8592:. Retrieved 8583: 8569:. Retrieved 8559: 8523: 8519: 8501:. Retrieved 8491: 8470: 8441:. Retrieved 8418: 8405: 8396: 8381: 8372: 8367: 8327: 8315:. Retrieved 8304: 8291: 8280:. Retrieved 8271: 8262: 8251:. Retrieved 8247: 8238: 8227:. Retrieved 8223: 8214: 8203:. Retrieved 8199: 8190: 8179:. Retrieved 8176:v2.ocaml.org 8175: 8166: 8155:. Retrieved 8151: 8142: 8131:. Retrieved 8127: 8118: 8107:. Retrieved 8102: 8093: 8082:. Retrieved 8078: 8069: 8058:. Retrieved 8054: 8045: 8034:. Retrieved 8006:. Retrieved 7992: 7971: 7959:. Retrieved 7948: 7939: 7927:. Retrieved 7899: 7895: 7882: 7861: 7853: 7847:. Retrieved 7843:the original 7833: 7822: 7811: 7802: 7792: 7780:. Retrieved 7771: 7761: 7752: 7727:. Retrieved 7720:the original 7711: 7704: 7693:. Retrieved 7684: 7675: 7663:. Retrieved 7646: 7642: 7632: 7620:. Retrieved 7598: 7591: 7579:. Retrieved 7570: 7561: 7549:. 
Retrieved 7536: 7513: 7501:. Retrieved 7497:the original 7488: 7464:. Retrieved 7450: 7442: 7438: 7428: 7418:December 10, 7416:. Retrieved 7411: 7402: 7393: 7386: 7382: 7378: 7373: 7362:. Retrieved 7353: 7349: 7339: 7333:Kozen (1991) 7328: 7320: 7306: 7300: 7289: 7278:. Retrieved 7274: 7265: 7248: 7237: 7225: 7214: 7188:. Retrieved 7184: 7156:. Retrieved 7147: 7138: 7127:. Retrieved 7118: 7109: 7098:. Retrieved 7095:www.pcre.org 7094: 7085: 7060:. Retrieved 7051: 7038: 7027:. Retrieved 7018: 7009: 6998:. Retrieved 6984: 6973:. Retrieved 6969:the original 6951: 6939: 6912:. Retrieved 6908:the original 6897: 6886:. Retrieved 6863: 6850: 6808: 6800: 6793:. Retrieved 6786:the original 6779: 6766: 6754: 6742:. Retrieved 6737: 6728: 6716:. Retrieved 6712: 6703: 6691:. Retrieved 6671: 6664: 6653:. Retrieved 6633: 6626: 6615:. Retrieved 6611:the original 6606: 6596: 6541: 6527: 6498: 6497: 6411: 6410: 6323: 6322: 6238: 6237: 6153: 6152: 6077: 6076: 6015:in Unicode. 6011:in ASCII or 5986: 5985: 5891: 5890: 5826: 5802: 5801: 5712: 5711: 5649:in Unicode. 
5648: 5633: 5609: 5608: 5537: 5505: 5504: 5447: 5423: 5422: 5347: 5346: 5275: 5274: 5170: 5169: 5088: 5087: 4990: 4989: 4908: 4907: 4835: 4834: 4741: 4740: 4673:Description 4663: 4638: 4631: 4627:substitution 4626: 4622: 4617: 4613: 4598: 4578: 4558: 4542:web scraping 4540:(especially 4527: 4442: 4403: 4387: 4383: 4368:\p{Armenian} 4355: 4333: 4323: 4317: 4313: 4308: 4300: 4274: 4268: 4254: 4250: 4242: 4238: 4231: 4221: 4203: 4186: 4070: 4058: 4054: 4022:backtracking 4019: 4010: 4006: 4003: 3998: 3994: 3990: 3985: 3980: 3970: 3963: 3926: 3915: 3891: 3878: 3861: 3849: 3820: 3810: 3809:, or simply 3806: 3802: 3794: 3792: 3772:context-free 3769: 3760: 3752: 3746: 3737: 3729: 3705:(?>group) 3702: 3675: 3663: 3661: 3643: 3639: 3635: 3629: 3603: 3550: 3534: 3530: 3526: 3522: 3520: 3461: 3457: 3445: 3433: 2518:Description 2509: 2505: 2497: 2489: 2484: 2469: 2456:command line 2453: 2423: 2422: 2375:Description 2362: 2340: 2334: 2325: 2263: 2262: 2250: 2246: 2235: 2231: 2222: 2218: 2184: 2180: 2173: 2158: 2153: 2080: 2027:Description 2018:) does not. 2015: 1992: 1986: 1958: 1929: 1928:, which are 1907: 1902: 1890: 1886: 1882: 1873: 1789: 1768:{}()^$ .|*+? 
1741: 1733: 1728: 1724: 1716: 1712: 1708: 1706: 1689:Dexter Kozen 1673: 1668: 1664: 1660: 1656: 1652: 1649: 1645: 1641: 1637: 1633: 1629: 1626: 1622: 1618: 1616: 1601: 1598: 1584: 1581: 1570: 1563: 1543: 1541: 1409: 1407: 1318:is given by 1312: 1308: 1304: 1300: 1296: 1289: 1279: 1270: 1266: 1262: 1232: 1197: 1196: 1177: 1154: 1140: 1128:denotes the 1119: 1101: 1095: 1089: 1079: 1072:empty string 1070: 1063: 1052: 1032: 1017:The precise 1016: 1013: 992: 973: 969: 962: 953: 946: 937: 930: 921: 914: 901: 894: 882:zero or more 881: 874: 861: 854: 746:vertical bar 739:Boolean "or" 734: 725: 722:H(ä|ae?)ndel 709: 707: 697: 693: 685: 679: 672: 668: 643: 641: 635: 631: 541: 526: 494: 490: 488: 479: 441: 425:Philip Hazel 422: 395: 354: 340: 306:programs at 301: 286: 271:code on the 253:Ken Thompson 249: 243:include the 216: 206: 162:text editors 151: 124: 87: 83: 79: 75: 73: 66: 65:(lower case 32: 9789:Suffix tree 9445:restricted 9052:Wall, Larry 8306:Google Blog 8224:www.php.net 7961:21 November 7929:21 November 7782:24 November 7581:24 November 7551:23 December 7144:"CUDA grep" 7079:Wall (2002) 7044:Wall, Larry 6944:Aycock 2003 6759:Kleene 1951 6744:24 February 6718:24 February 6505:Hello World 6245:Hello World 6160:Hello World 5430:Hello World 5354:Hello World 4748:Hello World 4655:instead of 4518:A blacklist 4364:\P{Block=X} 4352:\p{Block=X} 4247:code points 3953:(?<!...) 3947:(?<=...) 3788:NP-complete 3778:. However, 3698:"Ganymede," 3652:"Ganymede," 3632:"Ganymede," 3040:Non-digits 2794:\< \> 2510:aAbBcC...zZ 1848:/re1/,/re2/ 1729:quantifiers 1697:Horn clause 1687:. 
In 1991, 1676:Kleene star 1520: times 1142:Kleene star 1121:alternation 902:one or more 862:zero or one 842:Kleene plus 833:), and the 831:Kleene star 767:Parentheses 690:recursively 627:Kleene star 623:Translating 529:text editor 489:The phrase 9880:Categories 9066:2006-10-11 8945:Wrox Press 8931:2009-04-01 8868:2017-12-10 8777:2011-02-03 8708:2009-06-15 8672:2005-04-26 8623:Forta, Ben 8614:2008-04-27 8594:2011-12-13 8571:2011-12-13 8503:2013-12-14 8455:References 8443:2017-09-10 8380:, p. 213; 8371:E.g., see 8282:2023-02-24 8253:2023-02-24 8229:2023-02-04 8205:2023-02-04 8181:2022-08-21 8157:2022-04-27 8133:2022-04-27 8109:2024-02-20 8084:2022-04-27 8060:2022-04-27 8036:2010-02-05 8008:2013-09-25 7983:1903.05896 7849:2022-02-12 7816:Cox (2007) 7729:2022-09-05 7695:2019-11-21 7665:2015-07-03 7503:10 October 7466:January 8, 7364:2018-03-28 7280:2024-02-21 7190:31 January 7158:2019-10-22 7129:2019-10-22 7100:2024-04-07 7062:2006-10-10 7029:2013-10-12 7019:PostgreSQL 7000:2013-10-11 6975:2009-02-17 6888:2013-05-15 6655:2016-07-25 6617:2016-10-31 5933:property. 5919:in ASCII; 5632:Matches a 5540:Alphabetic 5056:m/(l.+?o)/ 4570:small caps 4477:JavaScript 4279:Devanagari 3966:algorithms 3839:Lookahead 3836:Lookbehind 3826:Assertions 3815:Larry Wall 3709:^(wi|w)i$ 3684:, because 3664:possessive 3585:XML Schema 3561:JavaScript 3545:See also: 2353:\{ \} 2345:\( \) 2162:\( \) 1979:patterns. 1895:deprecated 1800:delimiters 1786:Delimiters 1736:quantifier 1680:set unions 1612:isomorphic 1259:complement 809:quantifier 460:ECMAScript 427:developed 420:operator. 383:Raku rules 367:PostgreSQL 261:text files 211:described 92:characters 9399:Star-free 9353:Positive 9343:Decidable 9278:Positive 9202:Languages 8894:1813/6963 8697:0802.2869 8528:CiteSeerX 8391:, p. 106. 
8345:delimiter 8272:Crates.io 7873:1308.3822 7439:swtch.com 7252:Based on 7119:grovf.com 6914:9 October 6795:13 August 6518:Induction 6460:$ string1 6442:$ string1 6373:$ string1 6355:$ string1 6285:$ string1 6267:$ string1 6200:$ string1 6182:$ string1 6124:$ string1 6106:$ string1 6039:$ string1 6021:$ string1 6013:\P{Digit} 5957:$ string1 5939:$ string1 5923:\p{Digit} 5859:m/\S.*\S/ 5853:$ string1 5835:$ string1 5770:m/\s.*\s/ 5764:$ string1 5746:$ string1 5674:$ string1 5656:$ string1 5571:$ string1 5553:$ string1 5476:$ string1 5458:$ string1 5394:$ string1 5376:$ string1 5318:$ string1 5300:$ string1 5243:m/l{1,2}/ 5237:$ string1 5219:$ string1 5132:$ string1 5114:$ string1 5050:$ string1 5032:$ string1 4952:$ string1 4934:$ string1 4879:$ string1 4861:$ string1 4806:$ string1 4788:$ string1 4712:$ string1 4694:$ string1 4548:, simple 4522:Knowledge 4449:libraries 4400:\p{GC=Lu} 4125:time and 4052:(ReDoS). 3873:Negative 3844:Positive 3833:Assertion 3644:reluctant 3462:word-head 2424:Examples: 2264:Examples: 1977:recursive 1956:regexes. 1604:algorithm 1513:− 1504:⏟ 1494:∣ 1485:⋯ 1476:∣ 1461:∣ 1443:∗ 1432:∣ 1387:∣ 1372:∣ 1357:∣ 1343:∗ 1332:∣ 1198:Examples: 1188:a|(b(c*)) 1130:set union 1064:empty set 963:{min,max} 835:plus sign 775:gray|grey 771:operators 308:Bell Labs 267:(JIT) to 176:, and in 110:, or for 9197:Grammars 9060:Archived 9054:(2002). 9044:21260384 9016:(1968). 8965:(1998). 8912:19875225 8859:Archived 8841:17253809 8768:Archived 8666:Archived 8652:O'Reilly 8629:. Sams. 8625:(2004). 8588:Archived 8565:Archived 8550:15345671 8497:Archived 8482:(1992). 8437:Archived 8423:O'Reilly 8413:(2005). 8356:Archived 8351:". See ' 8311:Archived 8276:Archived 8196:"perlre" 8055:man7.org 8030:Archived 8002:Archived 7955:Archived 7920:Archived 7776:Archived 7689:Archived 7659:Archived 7622:11 March 7575:Archived 7545:Archived 7460:Archived 7358:Archived 7185:man7.org 7152:Archived 7123:Archived 7056:Archived 7046:(2006). 7023:Archived 6994:Archived 6963:(2003). 
6882:Archived 6687:Archived 6649:Archived 6552:See also 6379:m/d\n\z/ 6206:m/rld$ / 6008:same as 5963:m/(\d+)/ 5916:same as 5638:same as 5527:same as 5482:m/llo\b/ 5207:x* y+ z? 4718:m/...../ 4676:Example 4609:versions 4595:Examples 4420:\p{Dash} 4416:\p{Math} 4340:and the 4295:katakana 4291:hiragana 3993:in time 2524:Perl/Tcl 2357:{ } 2349:( ) 2049:newlines 2004:{ } 2000:( ) 1965:{ } 1961:( ) 1930:required 1926:{ } 1922:( ) 1862:with an 1814:, where 1802:, as in 1764:{ } 1760:( ) 1721:( ) 1707:A regex 1555:grammars 1303:} whose 1216:ab*(c|ε) 1151:superset 1056:alphabet 985:Wildcard 824:asterisk 763:Grouping 718:elements 684:, where 544:globbing 533:serialie 522:keyboard 485:Patterns 398:ISO SGML 299:design. 297:compiler 269:IBM 7094 149:syntax. 139:syntaxes 122:theory. 9855:Sorting 9825:Parsing 9545:Strings 9420:Decider 9394:Regular 9361:Indexed 9319:Regular 9286:Indexed 8469:(ed.). 7916:3175806 6959:citing 6693:25 July 6499:Output: 6412:Output: 6324:Output: 6239:Output: 6154:Output: 6078:Output: 5987:Output: 5892:Output: 5803:Output: 5713:Output: 5610:Output: 5506:Output: 5424:Output: 5348:Output: 5276:Output: 5171:Output: 5138:m/el*o/ 5089:Output: 4991:Output: 4958:m/H.?e/ 4909:Output: 4836:Output: 4742:Output: 4605:library 4589:Exalead 4550:parsing 4360:\P{InX} 4348:\p{InX} 4197:Unicode 4183:Unicode 4176:⁠ 4127:⁠ 4123:⁠ 4074:⁠ 3941:(?!...) 3935:(?=...) 3896:pattern 3882:pattern 3865:pattern 3853:pattern 3811:pattern 3761:squares 3640:minimal 2979:Digits 2448:cat|dog 2415:abc|def 2355:is now 2347:is now 2337:escaped 1987:In the 1950:grep -P 1946:grep -G 1942:grep -E 1744:literal 1709:pattern 1648:) and ( 1625:) and ( 1585:regexes 1557:of the 1265:; here 976:times. 956:times. 924:times. 
866:colou?r 726:matches 710:pattern 495:regexes 381:called 353:called 336:POSIX.2 192:History 184:", and 108:strings 9472:Finite 9404:Finite 9249:Type-3 9240:Type-2 9222:Type-1 9216:Type-0 9098:Curlie 9042:  9002:  8983:  8951:  8910:  8900:  8839:  8793:  8760:  8724:  8658:  8633:  8548:  8530:  8429:  8383:Python 7950:GitHub 7914:  7803:GitHub 7753:GitHub 7541:Oracle 7313:  7052:perlre 6874:  6679:  6641:  6291:m/\AH/ 6130:m/^He/ 4497:Python 4430:, and 4418:, and 4392:\p{Lu} 4216:UTF-32 4212:UTF-16 3807:regexp 3765:(.+)\1 3618:greedy 3616:) are 3583:, and 3569:Python 3419:XDigit 2533:ASCII 2240:a{3,5} 1952:" for 1916:, and 1856:s,/,X, 1826:as in 1824:g/re/p 1776:dswDSW 1734:greedy 1713:string 1703:Syntax 1587:. See 1249:, and 1210:(a|b)* 1186:, and 1159:closed 1019:syntax 947:{,max} 931:{min,} 822:, the 648:string 456:Python 444:lexers 330:, and 322:, and 245:SNOBOL 182:engine 84:regexp 33:Re:Gex 9818:Other 9774:DAFSA 9741:BLAST 9430:PTIME 9088:Regex 9040:S2CID 8977:31–90 8908:S2CID 8862:(PDF) 8855:(PDF) 8837:S2CID 8771:(PDF) 8742:(PDF) 8692:arXiv 8546:S2CID 8516:(PDF) 8487:(PDF) 8317:4 May 7978:arXiv 7923:(PDF) 7912:S2CID 7892:(PDF) 7868:arXiv 7723:(PDF) 7716:(PDF) 6789:(PDF) 6776:(PDF) 6588:Notes 6484:print 6475:print 6421:World 6418:Hello 6397:print 6388:print 6333:World 6330:Hello 6309:print 6300:print 6224:print 6215:print 6139:print 6063:print 6054:print 6045:m/\D/ 5972:print 5877:print 5868:print 5788:print 5779:print 5698:print 5689:print 5680:m/\W/ 5595:print 5586:print 5577:m/\w/ 5491:print 5409:print 5333:print 5261:print 5252:print 5202:{0,N} 5186:{M,N} 5156:print 5147:print 5074:print 5065:print 5024:{M,N} 4976:print 4967:print 4894:print 4885:m/l+/ 4821:print 4727:print 4657:POSIX 4645:\( \) 4634:POSIX 4623:match 4603:, or 4482:OCaml 4443:Most 4398:, or 4380:\p{X} 4374:, or 4245:have 4214:, or 4208:UTF-8 4189:ASCII 4066:agrep 3879:<! 3850:<= 3803:regex 3680:does 3678:".*+" 3648:".+?" 
3589:Boost 3565:Julia 3375:Upper 3281:Space 3227:Punct 3191:Print 3147:Lower 3103:Graph 3017:Digit 2965:Cntrl 2764:Blank 2720:Alpha 2594:Alnum 2558:ASCII 2521:POSIX 2508:, or 2458:flag 2208:(ab)* 1989:POSIX 1879:POSIX 1836:Linux 1725:atoms 1717:atoms 1589:below 1255:(a|ε) 1192:a|bc* 1180:(ab)c 1134:(R|S) 1126:(R|S) 813:token 519:ASCII 493:, or 332:Emacs 164:, in 143:POSIX 80:regex 9809:Trie 9799:Rope 9177:and 9000:ISBN 8981:ISBN 8949:ISBN 8898:ISBN 8791:ISBN 8758:ISBN 8722:ISBN 8720:. . 8656:ISBN 8631:ISBN 8427:ISBN 8374:Java 8333:Perl 8319:2019 7963:2019 7931:2019 7784:2019 7624:2024 7617:9485 7583:2019 7553:2016 7505:2015 7468:2012 7420:2023 7311:ISBN 7192:2023 6916:2013 6872:ISBN 6814:pg46 6797:2019 6746:2024 6720:2024 6695:2016 6677:ISBN 6639:ISBN 5324:m/+/ 5198:{M,} 4647:vs. 4641:Perl 4587:and 4574:{4,} 4509:Uses 4502:Rust 4487:Perl 4472:Java 4358:and 4338:Perl 4318:some 4293:and 4259:gawk 4241:and 3950:and 3938:and 3927:The 3920:and 3908:Perl 3715:and 3694:"*+" 3668:".*" 3636:lazy 3622:".+" 3612:and 3591:and 3573:Ruby 3557:Java 3553:Perl 3473:and 3460:and 3458:word 2925:)(?= 2913:< 2910:)|(? 2901:)(?= 2889:< 2849:)(?= 2837:< 2834:)|(? 
2825:)(?= 2813:< 2530:Java 2401:ab+c 2387:ab?c 2351:and 2308:at$ 2200:ab*c 2112:abc] 2108:abc] 2081:The 2012:\{\} 2010:and 2008:\(\) 2002:and 1963:and 1954:Perl 1938:grep 1924:and 1876:IEEE 1874:The 1852:Perl 1832:Unix 1828:grep 1820:/re/ 1804:/re/ 1796:"re" 1778:and 1770:and 1762:and 1678:and 1566:ISBN 1237:and 1204:a|b* 1172:(R*) 1164:(R*) 1147:(R*) 1113:(RS) 1108:(RS) 1006:a.*b 906:ab+c 886:ab*c 777:and 757:grey 751:gray 625:the 585:*)?| 472:PCRE 464:FPGA 454:and 452:Java 435:and 429:PCRE 418:LIKE 411:glob 371:Raku 343:Perl 324:expr 304:Unix 281:grep 172:and 160:and 147:Perl 135:Unix 118:and 100:text 59:/r+/ 53:Blue 9096:at 9030:doi 8890:hdl 8882:doi 8827:doi 8750:doi 8538:doi 8389:PHP 8337:m// 8152:MDN 7904:doi 7651:doi 7614:RFC 7604:doi 6542:not 6536:in 6466:m// 5925:or 5827:but 5634:non 5194:{M} 5022:or 4772:$ 2 4768:$ 1 4760:( ) 4661:). 4544:), 4520:on 4492:PHP 4462:C++ 4404:not 4362:or 4350:or 4309:may 4263:Vim 4015:re2 3906:in 3725:wii 3717:wii 3686:.*+ 3642:or 3638:or 3593:PHP 3500:or 3454:Vim 3287:or 3023:or 2527:Vim 2442:+at 2436:*at 2430:?at 2341:ERE 2320:s.* 2302:^at 2296:.at 2286:.at 2270:.at 2146:( ) 2053:a.c 2016:ERE 1993:BRE 1935:GNU 1891:SRE 1887:ERE 1883:BRE 1840:sed 1247:aa* 1184:abc 1037:in 1000:a.b 974:max 970:min 954:max 938:min 915:{n} 714:set 700:). 612:+)? 600:+)( 555:+$ 476:CPU 468:GPU 433:PHP 415:SQL 406:DTD 387:BNF 363:DFA 359:NFA 351:Tcl 320:AWK 316:sed 312:lex 257:QED 174:AWK 170:sed 98:in 82:or 9882:: 9173:: 9058:. 9038:. 9026:11 9024:. 9020:. 8979:. 8971:. 8947:. 8943:. 8906:. 8896:. 8888:. 8835:. 8823:11 8821:. 8817:. 8766:. 8756:. 8744:. 8686:. 8664:. 8654:. 8650:. 8582:. 8544:. 8536:. 8524:35 8522:. 8518:. 8495:. 8489:. 8435:. 8421:. 8417:. 8341:// 8309:. 8303:. 8274:. 8270:. 8246:. 8222:. 8198:. 8174:. 8150:. 8126:. 8101:. 8077:. 8053:. 8028:. 8017:^ 7947:. 7918:. 7910:. 7900:31 7898:. 7894:. 7852:. 7801:. 7774:. 7770:. 7751:. 7739:^ 7687:. 7683:. 7657:. 7647:14 7645:. 7641:. 7612:. 7573:. 7569:. 7543:. 7539:. 7535:. 
7524:^ 7491:. 7487:. 7476:^ 7441:. 7437:. 7410:. 7354:16 7348:. 7319:. 7273:. 7200:^ 7183:. 7167:^ 7150:. 7146:. 7121:. 7117:. 7093:. 7071:^ 7054:. 7050:. 7021:. 7017:. 6992:. 6924:^ 6880:. 6866:. 6862:. 6835:^ 6820:^ 6799:. 6778:. 6736:. 6711:. 6685:. 6647:. 6605:. 6463:=~ 6454:if 6376:=~ 6367:if 6345:\z 6288:=~ 6279:if 6257:\A 6203:=~ 6194:if 6172:$ 6127:=~ 6118:if 6042:=~ 6033:if 6002:\D 5960:=~ 5951:if 5910:\d 5856:=~ 5847:if 5821:\S 5767:=~ 5758:if 5728:\s 5677:=~ 5668:if 5628:\W 5574:=~ 5565:if 5521:\w 5479:=~ 5470:if 5451:. 5442:\b 5397:=~ 5388:if 5321:=~ 5312:if 5240:=~ 5231:if 5213:. 5135:=~ 5126:if 5053:=~ 5044:if 5018:, 5014:, 4955:=~ 4946:if 4882:=~ 4873:if 4809:=~ 4800:if 4782:. 4780:\2 4778:, 4776:\1 4770:, 4715:=~ 4706:if 4653:\d 4649:() 4611:. 4536:, 4467:C# 4426:, 4414:, 4410:, 4394:, 4370:, 4036:aa 4011:mn 3922:$ 3889:(? 3877:(? 3860:(? 3848:(? 3805:, 3767:. 3719:, 3713:wi 3654:. 3608:, 3577:Qt 3575:, 3571:, 3567:, 3563:, 3559:, 3537:. 3437:ab 3265:_s 2886:(? 2810:(? 2462:. 2460:-E 2292:at 2282:at 2276:at 2257:. 2245:\{ 2136:$ 2114:. 2110:, 2095:, 2092:, 2079:. 1971:, 1912:, 1812:ed 1808:re 1792:re 1782:. 1727:; 1636:, 1579:. 1561:. 1405:. 1253:= 1251:a? 1245:= 1243:a+ 1145:) 1124:) 1106:) 1084:) 1045:. 1025:. 844:). 807:A 780:gr 744:A 675:*) 642:A 638:") 615:. 570:+( 561:?( 552:+| 524:. 503:b. 466:, 439:. 328:vi 318:, 314:, 289:/p 287:re 285:g/ 277:ed 74:A 9537:e 9530:t 9523:v 9365:— 9323:— 9290:— 9255:— 9252:— 9246:— 9243:— 9237:— 9234:— 9231:— 9228:— 9225:— 9219:— 9163:e 9156:t 9149:v 9069:. 9046:. 9032:: 9008:. 8989:. 8957:. 8934:. 8914:. 8892:: 8884:: 8871:. 8843:. 8829:: 8799:. 8780:. 8752:: 8730:. 8711:. 8694:: 8675:. 8639:. 8617:. 8597:. 8574:. 8552:. 8540:: 8506:. 8446:. 8321:. 8285:. 8256:. 8232:. 8208:. 8184:. 8160:. 8136:. 8112:. 8087:. 8063:. 8039:. 8011:. 7986:. 7980:: 7965:. 7933:. 7906:: 7876:. 7870:: 7805:. 7786:. 7755:. 7732:. 7698:. 7668:. 7653:: 7626:. 7606:: 7585:. 7555:. 7507:. 7470:. 7422:. 7367:. 7283:. 7260:. 7194:. 7161:. 
7132:. 7103:. 7065:. 7032:. 7003:. 6978:. 6918:. 6891:. 6845:. 6830:. 6761:. 6748:. 6722:. 6697:. 6658:. 6620:. 6493:} 6490:; 6481:; 6472:{ 6469:) 6457:( 6451:; 6445:= 6406:} 6403:; 6394:; 6385:{ 6382:) 6370:( 6364:; 6358:= 6318:} 6315:; 6306:; 6297:{ 6294:) 6282:( 6276:; 6270:= 6233:} 6230:; 6221:; 6212:{ 6209:) 6197:( 6191:; 6185:= 6148:} 6145:; 6136:{ 6133:) 6121:( 6115:; 6109:= 6096:^ 6072:} 6069:; 6060:; 6051:{ 6048:) 6036:( 6030:; 6024:= 5981:} 5978:; 5969:{ 5966:) 5954:( 5948:; 5942:= 5886:} 5883:; 5874:; 5865:{ 5862:) 5850:( 5844:; 5838:= 5797:} 5794:; 5785:; 5776:{ 5773:) 5761:( 5755:; 5749:= 5707:} 5704:; 5695:; 5686:{ 5683:) 5671:( 5665:; 5659:= 5604:} 5601:; 5592:; 5583:{ 5580:) 5568:( 5562:; 5556:= 5500:} 5497:; 5488:{ 5485:) 5473:( 5467:; 5461:= 5418:} 5415:; 5406:{ 5403:) 5391:( 5385:; 5379:= 5366:| 5342:} 5339:; 5330:{ 5327:) 5315:( 5309:; 5303:= 5270:} 5267:; 5258:; 5249:{ 5246:) 5234:( 5228:; 5222:= 5165:} 5162:; 5153:; 5144:{ 5141:) 5129:( 5123:; 5117:= 5104:* 5083:} 5080:; 5071:; 5062:{ 5059:) 5047:( 5041:; 5035:= 5020:? 5016:+ 5012:* 5006:? 4985:} 4982:; 4973:; 4964:{ 4961:) 4949:( 4943:; 4937:= 4924:? 4903:} 4900:; 4891:{ 4888:) 4876:( 4870:; 4864:= 4851:+ 4830:} 4827:; 4818:{ 4815:) 4803:( 4797:; 4791:= 4736:} 4733:; 4724:{ 4721:) 4709:( 4703:; 4697:= 4682:. 4457:C 4434:. 4388:X 4384:X 4356:X 4255:y 4251:x 4243:y 4239:x 4164:) 4159:1 4156:+ 4153:k 4150:2 4146:n 4142:( 4137:O 4111:) 4106:2 4103:+ 4100:k 4097:2 4093:n 4089:( 4084:O 4045:b 4042:* 4039:) 4033:| 4030:a 4027:( 4009:( 4007:O 3999:n 3997:( 3995:O 3991:n 3986:O 3981:m 3918:^ 3898:) 3892:! 3884:) 3867:) 3862:= 3855:) 3690:" 3614:? 
3610:+ 3606:* 3515:* 3512:] 3509:_ 3506:] 3503:_ 3497:* 3494:w 3491:\ 3488:h 3485:\ 3479:h 3476:\ 3470:w 3467:\ 3440:] 3422:} 3416:{ 3413:p 3410:\ 3403:x 3400:\ 3378:} 3372:{ 3369:p 3366:\ 3359:u 3356:\ 3334:S 3331:\ 3324:S 3321:\ 3314:S 3311:\ 3293:s 3290:\ 3284:} 3278:{ 3275:p 3272:\ 3262:\ 3255:s 3252:\ 3230:} 3224:{ 3221:p 3218:\ 3194:} 3188:{ 3185:p 3182:\ 3175:p 3172:\ 3150:} 3144:{ 3141:p 3138:\ 3131:l 3128:\ 3106:} 3100:{ 3097:p 3094:\ 3070:D 3067:\ 3060:D 3057:\ 3050:D 3047:\ 3029:d 3026:\ 3020:} 3014:{ 3011:p 3008:\ 3001:d 2998:\ 2991:d 2988:\ 2968:} 2962:{ 2959:p 2956:\ 2934:) 2931:w 2928:\ 2922:w 2919:\ 2916:= 2907:W 2904:\ 2898:W 2895:\ 2892:= 2879:B 2876:\ 2858:) 2855:W 2852:\ 2846:w 2843:\ 2840:= 2831:w 2828:\ 2822:W 2819:\ 2816:= 2803:b 2800:\ 2788:b 2785:\ 2767:} 2761:{ 2758:p 2755:\ 2748:s 2745:\ 2723:} 2717:{ 2714:p 2711:\ 2704:a 2701:\ 2679:W 2676:\ 2669:W 2666:\ 2659:W 2656:\ 2638:w 2635:\ 2628:w 2625:\ 2618:w 2615:\ 2597:} 2591:{ 2588:p 2585:\ 2561:} 2555:{ 2552:p 2549:\ 2500:Z 2492:a 2479:d 2476:\ 2409:| 2395:+ 2381:? 2363:n 2361:\ 2314:\ 2255:} 2253:\ 2251:n 2249:, 2247:m 2236:n 2232:m 2225:} 2223:n 2221:, 2219:m 2217:{ 2204:* 2194:* 2185:n 2181:n 2174:n 2172:\ 2165:. 2154:n 2152:\ 2104:^ 2100:] 2087:^ 2083:- 2043:. 2033:^ 1918:| 1914:+ 1910:? 1864:X 1860:/ 1816:/ 1780:N 1772:\ 1752:\ 1669:F 1667:= 1665:E 1661:b 1659:, 1657:a 1653:b 1650:a 1646:b 1644:+ 1642:a 1638:Y 1634:X 1630:Y 1627:X 1623:Y 1621:+ 1619:X 1546:k 1544:L 1526:. 1516:1 1510:k 1500:) 1497:b 1491:a 1488:( 1482:) 1479:b 1473:a 1470:( 1467:) 1464:b 1458:a 1455:( 1448:a 1439:) 1435:b 1429:a 1426:( 1412:k 1410:L 1393:) 1390:b 1384:a 1381:( 1378:) 1375:b 1369:a 1366:( 1363:) 1360:b 1354:a 1351:( 1348:a 1339:) 1335:b 1329:a 1326:( 1316:4 1313:L 1309:a 1305:k 1301:b 1299:, 1297:a 1292:k 1290:L 1271:R 1267:R 1239:+ 1235:? 1155:R 1139:( 1118:( 1100:( 1092:. 1090:a 1086:a 1078:( 1069:( 1062:( 993:. 922:n 895:+ 875:* 855:? 840:( 838:+ 827:* 820:? 
798:y 795:) 792:e 789:| 786:a 783:( 754:| 698:s 696:( 694:N 686:s 682:* 680:s 673:s 671:( 669:N 636:s 632:s 630:( 609:d 606:\ 603:? 597:d 594:\ 591:. 588:\ 582:d 579:\ 576:. 573:\ 567:d 564:\ 549:^ 514:b 507:. 480:. 361:/ 67:r 62:g 43:. 36:. 20:)


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.