Position Anchors
| Anchor | Matches position at |
|---|---|
| ^ | start of the string |
| $ | end of the string |
| \b | boundary of word (start or end) |
| \B | all positions not \b |
Empty-string match: repetition operators not applicable.
| Anchor | Matches position at |
|---|---|
| ^ | start of the string |
| $ | end of the string |
| \b | boundary of word (start or end) |
| \B | all positions not \b |
Empty-string match: repetition operators not applicable.
| Notation | Sample | Matches |
|---|---|---|
| Direct concatenation | [aeiouy] | lowercase vowels |
| Range (hyphen) | [A-M] | uppercase A to M |
| Range concatenation | [A-Za-z0-9] | any alphanumerical |
| Complement (caret) | [^0-9] | any non-numerical |
| Meta | Matches Any | Equivalent |
|---|---|---|
| . (dot) | not new-line | [^\n] |
| \d | digit | [0-9] |
| \w | alphanum. or underscore | [a-zA-Z0-9_] |
| \s | white-space | [\n\r\t\f] |
Uppercase meta-chars. match the complement set instead.
For example, \s matches non-white-space characters,
which is equivalent to the bracket [^\n\r\t\f].
| Indicator | Matches |
|---|---|
| ? | either 0 or 1 |
| * | 0 or more |
| + | 1 or more |
| {M,} | M or more |
| {M} | exactly M |
| {M,N} | M to N, both inclusive |
Suffixes changing the quantity of the prior.
For example, a{4} is equivalent
to aaaa.
| Flag | Name | Change |
|---|---|---|
| g | global | All matches returned (vs. first) |
| m | multi-line | ^/$ by line (vs. entire string) |
| i | case-insensitive | Letters turn case-insensitive |
| s | dot-all | Dot also matches \n |
The listed flags are those available in most Regex engines.
They are used to change the behavior of the matching algorithm.
Each implementation has a specific way to set the flags.
In Javascript, in RegExp constructor or suffix of forward-slash notation,
In Python re, provided as optional argument to callables.
In Regex commands, as regular flag.
Parenthesis are used to mark capture groups. By index, starting at 1, they can be referenced in the replacements. Different notations exist, depending on the tool.
\1 in most commands like sed or awk,
vim and most GNOME editors, Python re in raw strings.
\\1 in above cases, but where the string is parsing escape codes,
for example in Python re in normal strings.
$1 tends to be used in programming languages (Java, Perl, JS)
and popular editors (VS Code, IntelliJ)
\g<1> is used as well in Python re.
\g does not clash with escape codes,
thus escaping it is optional. Can be used with named groups.
Most HTML pages of this site are made by hand, which is enabled by a frequent use of Regex Find-and-Replace. It is faster than manual editing for repetitive patterns. It does not require a new file, unlike scripts. It provides un-doing and counts, unlike command pipelines. It is also possible to target multiple files with a first-step globbing pattern, and a second-step Regex pattern for the text operation.
Search
<address (.*)</address>
<time (.*)</time>
Replace
<time $2</time>
<address $1</address>
The above swaps two elements in-place. It includes their content, plus their attributes by not closing the start-tag. To write correct patterns for the needed job with confidence, it helps to pair hand-made, known mark-up, with good knowledge of HTML specification. In this case, it is known the content of both has no new-lines, which makes dot-star sufficient.
pattern to validate input
This example pattern='[0-9abcdefABCDEF]{6}'
validates the input
values, rejecting strings that are not
hexadecimal values of length six / 24 bits.
It also marks it required,
otherwise length zero is also accepted despite
the mismatching pattern.
Regular Expressions in Javascript can be instantiated
either via the RegExp
constructor, or directly by using
forward-slash instead of regular quotation.
Both have different mechanisms for flagging,
the snippet showing this for the global flag
(a requirement for findAll
and replaceAll operations).
const by_ctor = new RegExp(".*");
const by_fwds = /.*/;
const by_ctor_global = new RegExp(".*", 'g');
const by_fwds_global = /.*/g;
This text sample is editable. ABCDEF | abcdef GHIJKL | g hijkl MNOPQR | mn opqr STUVWX | stu vwx YZ0123 | yz01 23 456789 | 456789 abc def ghi jkl mno pqr stu vwx yz0 123 456 789 Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla massa turpis, pretium sit amet erat vitae, porta suscipit tellus. Fusce accumsan elit eu justo vehicula, quis tempus nulla condimentum. Maecenas maximus, dolor nec fermentum eleifend, nisi enim bibendum mauris, ut consequat justo tortor ut enim.