Regular Expressions 200725 | Notes on regular expressions.

Excerpts from this tutorial.

Regular expressions are useful for extracting information from text by searching for one or more matches of a specific pattern.

A regex is usually written in the form /abc/, where the search pattern is delimited by two slash characters (/).

Anchors ^ $

^The        matches any string that starts with The
end$        matches a string that ends with end
^The end$   exact string match (starts and ends with The end)
long        matches any string that has the text long in it
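
The anchor patterns above can be tried out with a short sketch. Python is an assumption here, since these notes do not name a language; its re module takes the pattern as a string rather than between slashes, and re.search looks for a match anywhere in the input. The sample strings are made up for illustration.

```python
import re

# Hypothetical sample strings, chosen so each pattern picks a different subset.
samples = ["The end", "The long end", "At the end", "The start"]

patterns = {
    "starts with The": r"^The",
    "ends with end": r"end$",
    "exactly The end": r"^The end$",
    "contains long": r"long",
}

# re.search scans the whole string, so only the anchors restrict
# where a match is allowed to begin or finish.
matches = {
    label: [s for s in samples if re.search(pattern, s)]
    for label, pattern in patterns.items()
}

for label, matched in matches.items():
    print(label, "->", matched)
```

Without any anchor, as in the last pattern, the text may appear anywhere in the string.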

Quantifiers * + ?

abc*        matches a string that has ab followed by zero or more c
abc+        matches a string that has ab followed by one or more c
abc?        matches a string that has ab followed by zero or one c
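
A small sketch of how the three quantifiers differ, written in Python as an assumed language. Note that a quantifier applies only to the item directly before it (here the c, not the whole of abc). The sketch uses re.fullmatch, which requires the entire string to match, so the effect of each quantifier is easy to see; the helper name accepted is made up for this example.

```python
import re

def accepted(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

candidates = ["ab", "abc", "abccc"]

star = accepted(r"abc*", candidates)      # ab followed by zero or more c
plus = accepted(r"abc+", candidates)      # ab followed by one or more c
optional = accepted(r"abc?", candidates)  # ab followed by zero or one c

print("abc* :", star)
print("abc+ :", plus)
print("abc? :", optional)
```

With a plain search instead of a full match, all three patterns would find a match inside abccc, because the string contains ab followed by at least one c.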

Character classes

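Use the uppercase form to negate a class: \D, \W and \S match any character that is not a digit, not a word character, and not whitespace, respectively.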

\d         matches a single character that is a digit
\w         matches a word character (alphanumeric character plus underscore)
\s         matches a whitespace character (includes tabs and line breaks)
.          matches any character (in most flavors, every character except a line break)
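
The character classes can be explored with re.findall, which returns every match as a list. This is a sketch in Python (an assumed language), and the sample text is invented. The last line relies on Python's default behavior, where the dot does not match a line break.

```python
import re

text = "Room 42, floor_3\tEast"

digits = re.findall(r"\d", text)       # each single digit
non_digits = re.findall(r"\D+", text)  # uppercase \D negates \d
words = re.findall(r"\w+", text)       # runs of letters, digits, underscore
spaces = re.findall(r"\s", text)       # two spaces and one tab

# The dot matches "b" and "-", but not the line break in "a\nc".
dot_hits = re.findall(r"a.c", "abc a-c a\nc")

print(digits, non_digits, words, spaces, dot_hits)
```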


To match any of the characters ^.[$()|*+?{\ literally, escape it with a backslash \, because these characters otherwise have special meaning.
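
A short sketch of why escaping matters, again assuming Python. The price text is invented for the example. re.escape is a standard-library helper that adds the backslashes for you when the literal text comes from a variable.

```python
import re

price_text = "Total: $5.00 (approx.)"

# Unescaped, the dot matches any character, so "5x00" matches too.
loose = re.search(r"5.00", "5x00") is not None

# Escaped, the dot only matches a literal dot.
strict = re.search(r"5\.00", "5x00") is not None

# \$ and \. match a literal dollar sign and a literal dot.
literal = re.search(r"\$5\.00", price_text)

# re.escape backslash-escapes the special characters in arbitrary text.
escaped = re.escape("(approx.)")
found = re.search(escaped, price_text)

print(loose, strict, literal.group(), found.group())
```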


Flags

g (global)
does not return after the first match; each subsequent search restarts from the end of the previous match
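
Python, used here as an assumed example language, has no g flag. The choice of function plays that role instead: re.search stops at the first match, while re.findall and re.finditer keep going. The sketch below also shows that each new search starts where the previous match ended, so matches do not overlap.

```python
import re

text = "cat bat rat"

# Without "global" behavior: only the first match is returned.
first = re.search(r"\wat", text).group()

# With "global" behavior: every non-overlapping match is returned.
all_hits = re.findall(r"\wat", text)

# The spans show each search resuming after the previous match.
spans = [m.span() for m in re.finditer(r"\wat", text)]

# "aaaa" contains three overlapping "aa" substrings, but only two
# are reported, because scanning resumes at the end of each match.
pairs = re.findall(r"aa", "aaaa")

print(first, all_hits, spans, pairs)
```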

m (multi-line)   
when enabled, ^ and $ match the start and end of each line instead of the start and end of the whole string
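
In Python (an assumed language for this sketch), the multi-line flag is spelled re.MULTILINE. The three-line sample text is invented.

```python
import re

text = "first line\nsecond line\nthird line"

# Default: ^ and $ anchor to the whole string only.
default_starts = re.findall(r"^\w+", text)
default_ends = re.findall(r"\w+$", text)

# Multi-line: ^ and $ anchor to every line.
multiline_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
multiline_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)

print(default_starts, default_ends)
print(multiline_starts, multiline_ends)
```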

i (insensitive)  
makes the whole expression case-insensitive (for instance /aBc/i would match AbC)
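
The same example in Python (an assumed language for this sketch), where the flag is spelled re.IGNORECASE. The second sample sentence is invented.

```python
import re

# Case-sensitive by default: aBc does not match AbC.
sensitive = re.search(r"aBc", "AbC")

# With the flag, letter case is ignored throughout the pattern.
insensitive = re.search(r"aBc", "AbC", flags=re.IGNORECASE)

# Combined with findall, both capitalizations are found.
hits = re.findall(r"the", "The cat and the hat", flags=re.IGNORECASE)

print(sensitive, insensitive.group(), hits)
```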