Regular Expressions
200725 | Notes on regular expressions.
Excerpts from this tutorial.
Regular expressions are useful for extracting information from any text by searching for one or more matches of a specific pattern.
A regex usually comes in the form /abc/, where the search pattern is delimited by two slash characters (/).
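As a minimal sketch of searching a text for a pattern, here is abc in Python. Python is an assumption of this example, not something the tutorial prescribes; it writes the pattern as a raw string instead of placing it between slashes.

```python
import re

text = "abc, xabcx, ab"

# The /abc/ form corresponds to the raw string r"abc" in Python.
first = re.search(r"abc", text)    # match object for the first match only
every = re.findall(r"abc", text)   # list of all non-overlapping matches

print(first.start())  # index where the first match begins
print(every)
```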
Anchors ^ $
^The matches any string that starts with The
end$ matches a string that ends with end
^The end$ exact string match (starts and ends with The end)
long matches any string that has the text long in it
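The anchor rules above can be checked with a small sketch. It assumes Python's re module, and the sample strings are made up for illustration.

```python
import re

# Hypothetical sample strings.
samples = ["The end", "The long end", "At The end", "The end."]

# re.search looks for the pattern anywhere, so the anchors do the positioning.
starts_with_the = [s for s in samples if re.search(r"^The", s)]
ends_with_end = [s for s in samples if re.search(r"end$", s)]
exact = [s for s in samples if re.search(r"^The end$", s)]
contains_long = [s for s in samples if re.search(r"long", s)]

print(starts_with_the)  # ['The end', 'The long end', 'The end.']
print(ends_with_end)    # ['The end', 'The long end', 'At The end']
print(exact)            # ['The end']
print(contains_long)    # ['The long end']
```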
Quantifiers * + ?
abc* matches a string that has ab followed by zero or more c
abc+ matches a string that has ab followed by one or more c
abc? matches a string that has ab followed by zero or one c
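A short sketch of how the three quantifiers differ, assuming Python's re module. The helper matched_text is a hypothetical name introduced only for this example; it returns the text that matched, or None.

```python
import re

def matched_text(pattern, text):
    """Return the matched substring, or None if the pattern is not found."""
    m = re.search(pattern, text)
    return m.group() if m else None

samples = ["ab", "abc", "abccc"]

# Each quantifier applies only to the c directly before it.
star = [matched_text(r"abc*", s) for s in samples]      # zero or more c
plus = [matched_text(r"abc+", s) for s in samples]      # one or more c
optional = [matched_text(r"abc?", s) for s in samples]  # zero or one c

print(star)      # ['ab', 'abc', 'abccc']
print(plus)      # [None, 'abc', 'abccc']
print(optional)  # ['ab', 'abc', 'abc']
```

Note that abc? still finds a match inside abccc; it simply stops after the first c.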
Character classes
Use the uppercase letter to negate a class (for example, \D matches a single character that is not a digit).
\d matches a single character that is a digit
\w matches a word character (alphanumeric character plus underscore)
\s matches a whitespace character (includes tabs and line breaks)
. matches any character (in most engines, any character except a line break by default)
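The character classes can be tried out as follows. This is a sketch assuming Python's re module, where \d and \w also accept non-ASCII digits and letters by default; the sample text is made up.

```python
import re

text = "Room 42, id_7!"  # hypothetical sample text

digits = re.findall(r"\d", text)       # single digit characters
non_digits = re.findall(r"\D", text)   # uppercase negates: everything but digits
words = re.findall(r"\w+", text)       # runs of letters, digits and underscores
spaces = re.findall(r"\s", text)       # whitespace characters

# In Python, "." does not match a line break unless re.DOTALL is set.
any_char = re.findall(r".", "a\nb")

print(digits)    # ['4', '2', '7']
print(words)     # ['Room', '42', 'id_7']
print(spaces)    # [' ', ' ']
print(any_char)  # ['a', 'b']
```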
Escaping
To match any of the characters ^.[$()|*+?{\ literally, escape it with a backslash (\), as these characters have a special meaning.
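A minimal sketch of why escaping matters, assuming Python's re module. The re.escape helper is part of Python's standard library, not of the tutorial, and the sample text is made up.

```python
import re

text = "Total: $5.00 (approx.)"  # hypothetical sample text

# Unescaped, "." matches any character, so the pattern 5.00 also matches "5x00".
loose = bool(re.search(r"5.00", "5x00"))
strict = bool(re.search(r"5\.00", "5x00"))

# Escaping $ and . makes them match the literal characters.
price = re.search(r"\$5\.00", text).group()

# re.escape adds the backslashes for an arbitrary literal string.
literal = re.escape("(approx.)")
found = re.search(literal, text).group()

print(loose, strict)  # True False
print(price)          # $5.00
print(found)          # (approx.)
```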
Flags
g (global)
does not return after the first match, restarting the subsequent searches from the end of the previous match
m (multi-line)
when enabled ^ and $ will match the start and end of a line, instead of the whole string
i (insensitive)
makes the whole expression case-insensitive (for instance /aBc/i would match AbC)
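The flags above belong to the /pattern/flags notation. As a rough sketch of the same behavior in Python (an assumption of this example): there is no g flag, but re.findall returns every match, while m and i correspond to re.MULTILINE and re.IGNORECASE.

```python
import re

text = "abc\nABC\nxabc"  # hypothetical three-line sample

# g: re.findall returns every match, not only the first one.
all_matches = re.findall(r"abc", text)

# m: without re.MULTILINE, ^ and $ refer to the whole string.
without_m = re.findall(r"^abc$", text)
with_m = re.findall(r"^abc$", text, flags=re.MULTILINE)

# i: re.IGNORECASE ignores letter case, so aBc matches AbC.
with_i = re.findall(r"abc", text, flags=re.IGNORECASE)
mixed_case = bool(re.search(r"aBc", "AbC", flags=re.IGNORECASE))

# Flags can be combined with |.
both = re.findall(r"^abc$", text, flags=re.MULTILINE | re.IGNORECASE)

print(all_matches)  # ['abc', 'abc']
print(without_m)    # []
print(with_m)       # ['abc']
print(with_i)       # ['abc', 'ABC', 'abc']
print(both)         # ['abc', 'ABC']
```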