RegEx Cheat Sheet

A quick reference for regular expressions (regex), including symbols, ranges, grouping, assertions and some sample patterns to get you started.


#Getting Started

Introduction

This is a quick cheat sheet for getting started with regular expressions.


Character Classes
| Pattern | Description |
| ------- | :---------- |
| [abc] | A single character of: a, b or c |
| [^abc] | A character except: a, b or c |
| [a-z] | A character in the range: a-z |
| [^a-z] | A character not in the range: a-z |
| [0-9] | A digit in the range: 0-9 |
| [a-zA-Z] | A character in the range: a-z or A-Z |
| [a-zA-Z0-9] | A character in the range: a-z, A-Z or 0-9 |


Quantifiers
| Pattern | Description |
| ------- | :---------- |
| a? | Zero or one of a |
| a* | Zero or more of a |
| a+ | One or more of a |
| [0-9]+ | One or more of 0-9 |
| a{3} | Exactly 3 of a |
| a{3,} | 3 or more of a |
| a{3,6} | Between 3 and 6 of a |
| a* | Greedy quantifier |
| a*? | Lazy quantifier |
| a*+ | Possessive quantifier |

Common Metacharacters

| Pattern | Description |
| ------- | :---------- |
| ^ | Matches the start of a string. |
| { | Starts a quantifier for the number of occurrences. |
| + | Matches one or more of the preceding element. |
| < | Not a standard regex meta character (commonly used in HTML). |
| [ | Starts a character class. |
| * | Matches zero or more of the preceding element. |
| ) | Ends a capturing group. |
| > | Not a standard regex meta character (commonly used in HTML). |
| . | Matches any character except a newline. |
| ( | Starts a capturing group. |
| \| | Acts as a logical OR within a regex pattern. |
| $ | Matches the end of a string. |
| \ | Escapes a meta character, giving it literal meaning. |
| ? | Matches zero or one of the preceding element. |


Escape these special characters with \

Meta Sequences
| Pattern | Description |
| ------- | :---------- |
| . | Any single character |
| \s | Any whitespace character |
| \S | Any non-whitespace character |
| \d | Any digit, same as [0-9] |
| \D | Any non-digit, same as [^0-9] |
| \w | Any word character |
| \W | Any non-word character |
| \X | Any Unicode sequences, linebreaks included |
| \C | Match one data unit |
| \R | Unicode newlines |
| \v | Vertical whitespace character |
| \V | Negation of \v: anything except newlines and vertical tabs |
| \h | Horizontal whitespace character |
| \H | Negation of \h |
| \K | Reset match |
| \n | Match nth subpattern (n is a number, e.g. \1) |
| \pX | Unicode property X |
| \p{...} | Unicode property or script category |
| \PX | Negation of \pX |
| \P{...} | Negation of \p{...} |
| \Q...\E | Quote; treat as literals |
| \k<name> | Match subpattern name |
| \k'name' | Match subpattern name |
| \k{name} | Match subpattern name |
| \gn | Match nth subpattern |
| \g{n} | Match nth subpattern |
| \g<n> | Recurse nth capture group |
| \g'n' | Recurse nth capture group |
| \g{-n} | Match nth relative previous subpattern |
| \g<+n> | Recurse nth relative upcoming subpattern |
| \g'+n' | Recurse nth relative upcoming subpattern |
| \g'letter' | Recurse named capture group letter |
| \g{letter} | Match previously-named capture group letter |
| \g<letter> | Recurse named capture group letter |
| \xYY | Hex character YY |
| \x{YYYY} | Hex character YYYY |
| \ddd | Octal character ddd |
| \cY | Control character Y |
| [\b] | Backspace character |
| \ | Makes any character literal |

Anchors
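| Pattern | Description |
| ------- | :---------- |
| \G | Start of match |
| ^ | Start of string |
| $ | End of string |
| \A | Start of string |
| \Z | End of string (or just before a final newline) |
| \z | Absolute end of string |
| \b | A word boundary |
| \B | Non-word boundary |

Substitution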
| Pattern | Description |
| ------- | :---------- |
| \0 | Complete match contents |
| \1 | Contents in capture group 1 |
| $1 | Contents in capture group 1 |
| ${foo} | Contents in capture group foo |
| \x20 | Hexadecimal replacement values |
| \x{06fa} | Hexadecimal replacement values |
| \t | Tab |
| \r | Carriage return |
| \n | Newline |
| \f | Form-feed |
| \U | Uppercase Transformation |
| \L | Lowercase Transformation |
| \E | Terminate any Transformation |
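
Replacement syntax differs between engines. As a rough sketch of how the same idea looks in Python's `re.sub`, which uses `\1` and `\g<name>` rather than `$1` and `${foo}` (the sample strings and variable names below are made up for illustration):

```python
import re

# Swap two words using numbered groups in the replacement.
swapped = re.sub(r'(\w+) (\w+)', r'\2 \1', 'hello world')
print(swapped)  # world hello

# Refer to named groups with \g<name>.
dated = re.sub(r'(?P<year>\d{4})-(?P<month>\d{2})', r'\g<month>/\g<year>', '2024-05')
print(dated)  # 05/2024

# \g<0> stands for the complete match.
bracketed = re.sub(r'\d+', r'[\g<0>]', 'a1b22')
print(bracketed)  # a[1]b[22]
```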
Group Constructs
| Pattern | Description |
| ------- | :---------- |
| (...) | Capture everything enclosed |
| (a\|b) | Match either a or b |
| (?:...) | Match everything enclosed (non-capturing) |
| (?>...) | Atomic group (non-capturing) |
| (?\|...) | Duplicate subpattern group number |
| (?#...) | Comment |
| (?'name'...) | Named Capturing Group |
| (?<name>...) | Named Capturing Group |
| (?P<name>...) | Named Capturing Group |
| (?imsxXU) | Inline modifiers |
| (?(DEFINE)...) | Pre-define patterns before using them |
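
To see capturing, non-capturing and named groups side by side, here is a small Python sketch. Python's `re` module uses the `(?P<name>...)` spelling for named groups; the phone-number-like sample text is invented for the example.

```python
import re

# Two named groups around one non-capturing group.
m = re.search(r'(?P<area>\d{3})-(?:\d{3})-(?P<line>\d{4})', 'call 555-867-5309 now')

area = m.group('area')   # access by name
line = m['line']         # subscript access also works
groups = m.groups()      # the (?:...) group is not part of the result

print(area, line, groups)  # 555 5309 ('555', '5309')
```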
Assertions
| Pattern | Description |
| ------- | :---------- |
| (?(1)yes\|no) | Conditional statement |
| (?(R)yes\|no) | Conditional statement |
| (?(R#)yes\|no) | Recursive Conditional statement |
| (?(R&name)yes\|no) | Conditional statement |
| (?(?=...)yes\|no) | Lookahead conditional |
| (?(?<=...)yes\|no) | Lookbehind conditional |

Lookarounds
| Pattern | Description |
| ------- | :---------- |
| (?=...) | Positive Lookahead |
| (?!...) | Negative Lookahead |
| (?<=...) | Positive Lookbehind |
| (?<!...) | Negative Lookbehind |

Lookaround lets you match a group before (lookbehind) or after (lookahead) your main pattern without including it in the result.
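
A minimal Python sketch of that idea, using an invented price list: the `$` sign and the following text are checked but never become part of the match.

```python
import re

text = 'apple $5, pear $12, fig 7'

# Positive lookbehind: digits that come right after a "$".
prices = re.findall(r'(?<=\$)\d+', text)

# Positive lookahead: words that are followed by " $".
priced_items = re.findall(r'\b[a-z]+(?= \$)', text)

# Negative lookbehind: numbers not preceded by "$" or another digit.
bare_numbers = re.findall(r'(?<![\$\d])\d+', text)

print(prices)        # ['5', '12']
print(priced_items)  # ['apple', 'pear']
print(bare_numbers)  # ['7']
```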

Flags/Modifiers
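| Pattern | Description |
| ------- | :---------- |
| g | Global |
| m | Multiline |
| i | Case insensitive |
| x | Ignore whitespace |
| s | Single line |
| u | Unicode |
| X | eXtended |
| U | Ungreedy |
| A | Anchor |
| J | Duplicate group names |

Recurse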
| Pattern | Description |
| ------- | :---------- |
| (?R) | Recurse entire pattern |
| (?1) | Recurse first subpattern |
| (?+1) | Recurse first relative subpattern |
| (?&name) | Recurse subpattern name |
| (?P=name) | Match subpattern name |
| (?P>name) | Recurse subpattern name |

POSIX Character Classes
| Character Class | Same as | Meaning |
| --------------- | ------- | :------ |
| [[:alnum:]] | [0-9A-Za-z] | Letters and digits |
| [[:alpha:]] | [A-Za-z] | Letters |
| [[:ascii:]] | [\x00-\x7F] | ASCII codes 0-127 |
| [[:blank:]] | [\t ] | Space or tab only |
| [[:cntrl:]] | [\x00-\x1F\x7F] | Control characters |
| [[:digit:]] | [0-9] | Decimal digits |
| [[:graph:]] | [[:alnum:][:punct:]] | Visible characters (not space) |
| [[:lower:]] | [a-z] | Lowercase letters |
| [[:print:]] | [ -~] == [ [:graph:]] | Visible characters and space |
| [[:punct:]] | !"#$%&'()*+,-./:;<=>?@[\]^_`{\|}~ | Visible punctuation characters |
| [[:space:]] | [ \t\n\v\f\r] | Whitespace |
| [[:upper:]] | [A-Z] | Uppercase letters |
| [[:word:]] | [0-9A-Za-z_] | Word characters |
| [[:xdigit:]] | [0-9A-Fa-f] | Hexadecimal digits |
| [[:<:]] | \b(?=\w) | Start of word |
| [[:>:]] | \b(?<=\w) | End of word |


Control verb
| Pattern | Description |
| ------- | :---------- |
| (*ACCEPT) | Control verb |
| (*FAIL) | Control verb |
| (*MARK:NAME) | Control verb |
| (*COMMIT) | Control verb |
| (*PRUNE) | Control verb |
| (*SKIP) | Control verb |
| (*THEN) | Control verb |
| (*UTF) | Pattern modifier |
| (*UTF8) | Pattern modifier |
| (*UTF16) | Pattern modifier |
| (*UTF32) | Pattern modifier |
| (*UCP) | Pattern modifier |
| (*CR) | Line break modifier |
| (*LF) | Line break modifier |
| (*CRLF) | Line break modifier |
| (*ANYCRLF) | Line break modifier |
| (*ANY) | Line break modifier |
| \R | Line break modifier |
| (*BSR_ANYCRLF) | Line break modifier |
| (*BSR_UNICODE) | Line break modifier |
| (*LIMIT_MATCH=x) | Regex engine modifier |
| (*LIMIT_RECURSION=d) | Regex engine modifier |
| (*NO_AUTO_POSSESS) | Regex engine modifier |
| (*NO_START_OPT) | Regex engine modifier |

#Regex examples

Characters
| Pattern | Matches |
| ------- | :------ |
| ring | Match ring, springboard, etc. |
| . | Match a, 9, +, etc. |
| h.o | Match hoo, h2o, h/o, etc. |
| ring\? | Match ring? |
| \(quiet\) | Match (quiet) |
| c:\\windows | Match c:\windows |

Use \ to search for these special characters:
[ \ ^ $ . | ? * + ( ) { }

Alternatives
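| Pattern | Matches |
| ------- | :------ |
| cat\|dog | Match cat or dog |
| id\|identity | Meant to match id or identity, but always stops at id |
| identity\|id | Match id or identity |

Order alternatives from longer to shorter when they overlap; the engine takes the first alternative that succeeds.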
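
The effect of alternative order is easy to check in Python. This is only a quick sketch with a made-up input string:

```python
import re

text = 'id identity'

# Shorter alternative first: "identity" is never reached.
short_first = re.findall(r'id|identity', text)

# Longer alternative first: both words are matched in full.
long_first = re.findall(r'identity|id', text)

print(short_first)  # ['id', 'id']
print(long_first)   # ['id', 'identity']
```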

Character classes
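| Pattern | Matches |
| ------- | :------ |
| [aeiou] | Match any vowel |
| [^aeiou] | Match a NON vowel |
| r[iau]ng | Match ring, wrangle, sprung, etc. |
| gr[ae]y | Match gray or grey |
| [a-zA-Z0-9] | Match any letter or digit |
| [\u3a00-\ufa99] | Match any Unicode Hàn (中文) |

In [ ] always escape \ and ], and sometimes ^ and - (depending on their position). A . needs no escape inside [ ].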

Shorthand classes
| Pattern | Meaning |
| ------- | :------ |
| \w | "Word" character (letter, digit, or underscore) |
| \d | Digit |
| \s | Whitespace (space, tab, vtab, newline) |
| \W, \D, or \S | Not word, digit, or whitespace |
| [\D\S] | Not digit OR not whitespace, so every character matches |
| [^\d\s] | Disallow digit and whitespace |
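
The difference between `[\D\S]` and `[^\d\s]` is a common trap, so here is a small Python check on an invented five-character sample:

```python
import re

sample = 'a1 _\t'  # letter, digit, space, underscore, tab

# Union of "not a digit" and "not whitespace": nothing is excluded.
either = re.findall(r'[\D\S]', sample)

# Negated class: excludes both digits and whitespace.
neither = re.findall(r'[^\d\s]', sample)

print(len(either))  # 5
print(neither)      # ['a', '_']
```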
Occurrences
| Pattern | Matches |
| ------- | :------ |
| colou?r | Match color or colour |
| [BW]ill[ieamy's]* | Match Bill, Willy, William's, etc. |
| [a-zA-Z]+ | Match 1 or more letters |
| \d{3}-\d{2}-\d{4} | Match a SSN |
| [a-z]\w{1,7} | Match a UW NetID |

Greedy versus lazy
| Pattern | Meaning |
| ------- | :------ |
| * + {n,} | Greedy: match as much as possible |
| <.+> | Finds 1 big match in <b>bold</b> |
| *? +? {n,}? | Lazy: match as little as possible |
| <.+?> | Finds 2 matches in <b>bold</b> |
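
The two `<b>bold</b>` rows above can be reproduced directly in Python:

```python
import re

html = '<b>bold</b>'

greedy = re.findall(r'<.+>', html)   # runs to the last ">"
lazy = re.findall(r'<.+?>', html)    # stops at the first ">"

print(greedy)  # ['<b>bold</b>']
print(lazy)    # ['<b>', '</b>']
```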
Scope
| Pattern | Meaning |
| ------- | :------ |
| \b | "Word" edge (next to non-"word" character) |
| \bring | Word starts with "ring", e.g. ringtone |
| ring\b | Word ends with "ring", e.g. spring |
| \b9\b | Match single digit 9, not 19, 91, 99, etc. |
| \b[a-zA-Z]{6}\b | Match 6-letter words |
| \B | Not word edge |
| \Bring\B | Match springs and wringer |
| ^\d*$ | Entire string must be digits (an empty string also matches) |
| ^[a-zA-Z]{4,20}$ | String must have 4-20 letters |
| ^[A-Z] | String must begin with capital letter |
| [\.!?"')]$ | String must end with terminal punctuation |
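
A few of the boundary and anchor patterns above, tried out in Python on made-up inputs:

```python
import re

# \b9\b only matches a 9 standing on its own.
lone_nines = re.findall(r'\b9\b', '9 19 91 99 9')

words = 'ringtone spring ring'
starts_with_ring = re.findall(r'\bring\w*', words)
ends_with_ring = re.findall(r'\w*ring\b', words)

# ^...$ forces the whole string to fit the pattern.
all_digits = re.search(r'^\d*$', '12345') is not None
not_all_digits = re.search(r'^\d*$', '12a45') is not None

print(lone_nines)        # ['9', '9']
print(starts_with_ring)  # ['ringtone', 'ring']
print(ends_with_ring)    # ['spring', 'ring']
print(all_digits, not_all_digits)  # True False
```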
Modifiers
| Pattern | Meaning |
| ------- | :------ |
| (?i)[a-z]*(?-i) | Ignore case ON / OFF |
| (?s).*(?-s) | Match multiple lines (causes . to match newline) |
| (?m)^.*;$(?-m) | ^ & $ match lines, not whole string |
| (?x) | Free-spacing mode ON; a # comment runs to end of line and is ignored |
| (?-x) | Free-spacing mode OFF |
| /regex/ismx | Modify mode for entire string |
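
Inline modifier support varies by engine. The sketch below assumes Python's `re` module, which to my knowledge accepts global inline flags such as `(?s)` only at the start of a pattern and uses the scoped form `(?i:...)` in place of an ON / OFF pair. The sample strings are invented.

```python
import re

# Scoped flag: only "abc" is matched case-insensitively.
scoped = re.findall(r'(?i:abc)-\d', 'ABC-1 abc-2 AbC-x')

# (?s) lets . cross newlines.
spans_lines = re.search(r'(?s)start.*end', 'start\nmiddle\nend') is not None

# (?m) makes ^ and $ match at every line.
semicolon_lines = re.findall(r'(?m)^.*;$', 'a = 1;\nb = 2\nc = 3;')

# (?x) ignores whitespace and # comments inside the pattern.
phone = re.findall(r'''(?x)
    \d{3}   # prefix
    -       # separator
    \d{4}   # line number
''', '555-1234')

print(scoped)           # ['ABC-1', 'abc-2']
print(spans_lines)      # True
print(semicolon_lines)  # ['a = 1;', 'c = 3;']
print(phone)            # ['555-1234']
```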
Groups
| Pattern | Meaning |
| ------- | :------ |
| (in\|out)put | Match input or output |
| \d{5}(-\d{4})? | US zip code ("+ 4" optional) |

If the match fails after the group, the parser backtracks and tries EACH alternative.
This can lead to catastrophic backtracking.

Back references
| Pattern | Matches |
| ------- | :------ |
| (to) (be) or not \1 \2 | Match to be or not to be |
| ([^\s])\1{2} | Match non-space, then same twice more: aaa, ... |
| \b(\w+)\s+\1\b | Match doubled words |
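
As a quick illustration, the back-reference patterns above behave like this in Python (the sample sentence is made up):

```python
import re

text = 'this is is a test test of the the regex'

# findall returns the captured word for each doubled pair.
doubled = re.findall(r'\b(\w+)\s+\1\b', text)

# The same pattern can collapse each pair to a single word.
cleaned = re.sub(r'\b(\w+)\s+\1\b', r'\1', text)

# Three identical non-space characters in a row.
triple = re.search(r'([^\s])\1{2}', 'baaad').group(0)

print(doubled)  # ['is', 'test', 'the']
print(cleaned)  # this is a test of the regex
print(triple)   # aaa
```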
Non-capturing group
| Pattern | Meaning |
| ------- | :------ |
| on(?:click\|load) | Faster than on(click\|load) |

Use non-capturing or atomic groups when possible

Atomic groups
| Pattern | Meaning |
| ------- | :------ |
| (?>red\|green\|blue) | Faster than non-capturing |
| (?>id\|identity)\b | Match id, but not identity |

On the input "identity", "id" matches but \b fails after the atomic group, and the parser doesn't backtrack into the group to retry "identity".

If alternatives overlap, order longer to shorter.

Lookaround
| Pattern | Meaning |
| ------- | :------ |
| (?= ) | Lookahead, if you can find ahead |
| (?! ) | Lookahead, if you can NOT find ahead |
| (?<= ) | Lookbehind, if you can find behind |
| (?<! ) | Lookbehind, if you can NOT find behind |
| \b\w+?(?=ing\b) | Match the stem of warbling, string, fishing, ... (warbl, str, fish) |
| \b(?!\w+ing\b)\w+\b | Words NOT ending in ing |
| (?<=\bpre).*?\b | Match the part after pre in pretend, present, prefix, ... (tend, sent, fix) |
| \b\w{3}(?<!pre)\w*?\b | Words NOT starting with pre |
| \b\w+(?<!ing)\b | Match words NOT ending in ing |

If-then-else

Match "Ms." if the word "her" appears later in the string, otherwise match "Mr."

M(?(?=.*?\bher\b)s|r)\.

requires lookaround for IF condition
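
Python's `re` module, as far as I know, only supports conditionals that test a group number or name, such as `(?(1)yes|no)`, so the pattern above cannot be used there as written. One possible workaround, shown here as a sketch with a hypothetical helper `find_title`, is to spell out both branches with a positive and a negative lookahead:

```python
import re

# "s" if "her" appears later, "r" only if it does not.
title_re = re.compile(r'M(?:(?=.*?\bher\b)s|(?!.*?\bher\b)r)\.')

def find_title(sentence):
    """Return the matched title, or None when neither branch fits."""
    m = title_re.search(sentence)
    return m.group(0) if m else None

print(find_title('Ms. Smith lost her keys'))  # Ms.
print(find_title('Mr. Smith lost his keys'))  # Mr.
print(find_title('Mr. Smith lost her keys'))  # None
```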

#RegEx in Python

Getting started

Import the regular expressions module

import re
Examples

re.search()

>>> sentence = 'This is a sample string'
>>> bool(re.search(r'this', sentence, flags=re.I))
True
>>> bool(re.search(r'xyz', sentence))
False

re.findall()

>>> re.findall(r'\bs?pare?\b', 'par spar apparent spare part pare')
['par', 'spar', 'spare', 'pare']
>>> re.findall(r'\b0*[1-9]\d{2,}\b', '0501 035 154 12 26 98234')
['0501', '154', '98234']

re.finditer()

>>> m_iter = re.finditer(r'[0-9]+', '45 349 651 593 4 204')
>>> [m[0] for m in m_iter if int(m[0]) < 350]
['45', '349', '4', '204']

re.split()

>>> re.split(r'\d+', 'Sample123string42with777numbers')
['Sample', 'string', 'with', 'numbers']

re.sub()

>>> ip_lines = "catapults\nconcatenate\ncat"
>>> print(re.sub(r'^', r'* ', ip_lines, flags=re.M))
* catapults
* concatenate
* cat

re.compile()

>>> pet = re.compile(r'dog')
>>> type(pet)
<class 're.Pattern'>
>>> bool(pet.search('They bought a dog'))
True
>>> bool(pet.search('A cat crossed their path'))
False
Functions
| Function | Description |
| -------- | :---------- |
| re.findall | Returns a list containing all matches |
| re.finditer | Returns an iterator of match objects (one for each match) |
| re.search | Returns a Match object if there is a match anywhere in the string |
| re.split | Returns a list where the string has been split at each match |
| re.sub | Replaces one or many matches with a string |
| re.compile | Compiles a regular expression pattern for later use |
| re.escape | Returns the string with regex special characters backslashed |
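
`re.escape` is the only function in the table without an example above. A minimal sketch, with an invented literal, of why it matters when searching for text that contains special characters:

```python
import re

literal = 'ring? (quiet)'
haystack = 'the ring? (quiet) one'

# Unescaped, "?" and "( )" are treated as regex syntax.
raw_hit = re.search(literal, haystack)

# Escaped, the text is matched character for character.
escaped_hit = re.search(re.escape(literal), haystack)

print(raw_hit)               # None
print(escaped_hit.group(0))  # ring? (quiet)
```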
Flags
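| Short | Long | Description |
| ----- | ---- | :---------- |
| re.I | re.IGNORECASE | Ignore case |
| re.M | re.MULTILINE | Multiline |
| re.L | re.LOCALE | Make \w,\b,\s locale dependent |
| re.S | re.DOTALL | Dot matches all (including newline) |
| re.U | re.UNICODE | Make \w,\b,\d,\s unicode dependent (the default for str patterns in Python 3) |
| re.X | re.VERBOSE | Readable style |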
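
Flags can be combined with `|`. The sketch below shows `re.VERBOSE` with the zip code pattern from earlier in this sheet, and `re.I | re.M` together; the input strings are illustrative only.

```python
import re

# re.VERBOSE: whitespace and # comments in the pattern are ignored.
zip_re = re.compile(r'''
    \b\d{5}        # five-digit zip code
    (?:-\d{4})?    # optional "+ 4" part
    \b
''', re.VERBOSE)
zips = zip_re.findall('98105, 98195-1234 and 1234')

# Combine flags with |: ignore case, and let ^ match at each line.
apple_lines = re.findall(r'^apple', 'Apple\napple pie\nbanana', re.I | re.M)

print(zips)         # ['98105', '98195-1234']
print(apple_lines)  # ['Apple', 'apple']
```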

#Regex in JavaScript

test()
let textA = 'I like APPles very much';
let textB = 'I like APPles';
let regex = /apples$/i;

// Output: false
console.log(regex.test(textA));

// Output: true
console.log(regex.test(textB));
search()
let text = 'I like APPles very much';
let regexA = /apples/;
let regexB = /apples/i;

// Output: -1
console.log(text.search(regexA));

// Output: 7
console.log(text.search(regexB));
exec()
let text = 'Do you like apples?';
let regex = /apples/;

// Output: apples
console.log(regex.exec(text)[0]);

// Output: Do you like apples?
console.log(regex.exec(text).input);
match()
let text = 'Here are apples and apPleS';
let regex = /apples/gi;

// Output: [ "apples", "apPleS" ]
console.log(text.match(regex));
split()
let text = 'This 593 string will be brok294en at places where d1gits are.';
let regex = /\d+/g;

// Output: [ "This ", " string will be brok", "en at places where d", "gits are." ]
console.log(text.split(regex));
matchAll()
let regex = /t(e)(st(\d?))/g;
let text = 'test1test2';
let array = [...text.matchAll(regex)];

// Output: ["test1", "e", "st1", "1"]
console.log(array[0]);

// Output: ["test2", "e", "st2", "2"]
console.log(array[1]);
replace()
let text = 'Do you like aPPles?';
let regex = /apples/i;

// Output: Do you like mangoes?
let result = text.replace(regex, 'mangoes');
console.log(result);
replaceAll()
let regex = /apples/gi;
let text = 'Here are apples and apPleS';

// Output: Here are mangoes and mangoes
let result = text.replaceAll(regex, 'mangoes');
console.log(result);

#Regex in PHP

Functions
| Function | Description |
| -------- | :---------- |
| preg_match() | Performs a regex match |
| preg_match_all() | Perform a global regular expression match |
| preg_replace_callback() | Perform a regular expression search and replace using a callback |
| preg_replace() | Perform a regular expression search and replace |
| preg_split() | Splits a string by regex pattern |
| preg_grep() | Returns array entries that match a pattern |

preg_replace
$str = "Visit Microsoft!";
$regex = "/microsoft/i";

// Output: Visit CheatSheets!
echo preg_replace($regex, "CheatSheets", $str);
preg_match
$str = "Visit CheatSheets";
$regex = "#cheatsheets#i";

// Output: 1
echo preg_match($regex, $str);
preg_match_all
$regex = "/[a-zA-Z]+ (\d+)/";
$input_str = "June 24, August 13, and December 30";
if (preg_match_all($regex, $input_str, $matches_out)) {

    // Output: 2
    echo count($matches_out);

    // Output: 3
    echo count($matches_out[0]);

    // Output: Array("June 24", "August 13", "December 30")
    print_r($matches_out[0]);

    // Output: Array("24", "13", "30")
    print_r($matches_out[1]);
}
preg_grep
$arr = ["Jane", "jane", "Joan", "JANE"];
$regex = "/Jane/";

// preg_grep returns an array, so print it with print_r
// Output: Array("Jane")
print_r(preg_grep($regex, $arr));
preg_split
$str = "Jane\tKate\nLucy Marion";
$regex = "@\s@";

// Output: Array("Jane", "Kate", "Lucy", "Marion")
print_r(preg_split($regex, $str));

#Regex in Java

Styles

First way

Pattern p = Pattern.compile(".s", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("aS");
boolean s1 = m.matches();
System.out.println(s1);   // Outputs: true

Second way

boolean s2 = Pattern.compile("[0-9]+").matcher("123").matches();
System.out.println(s2);   // Outputs: true

Third way

boolean s3 = Pattern.matches(".s", "XXXX");
System.out.println(s3);   // Outputs: false
Pattern Fields
| Field | Description |
| ----- | :---------- |
| CANON_EQ | Canonical equivalence |
| CASE_INSENSITIVE | Case-insensitive matching |
| COMMENTS | Permits whitespace and comments |
| DOTALL | Dotall mode |
| MULTILINE | Multiline mode |
| UNICODE_CASE | Unicode-aware case folding |
| UNIX_LINES | Unix lines mode |

Methods

Pattern

  • Pattern compile(String regex, int flags)
  • boolean matches(String regex, CharSequence input)
  • String[] split(CharSequence input, int limit)
  • String quote(String s)

Matcher

  • int start(int group | String name)
  • int end(int group | String name)
  • boolean find(int start)
  • String group(int group | String name)
  • Matcher reset()

String

  • boolean matches(String regex)
  • String replaceAll(String regex, String replacement)
  • String[] split(String regex, int limit)

There are more methods ...

Examples

Replace sentence:

String regex = "[A-Z\n]{5}$";
String str = "I like APP\nLE";

Pattern p = Pattern.compile(regex, Pattern.MULTILINE);
Matcher m = p.matcher(str);

// Outputs: I like Apple!
System.out.println(m.replaceAll("pple!"));

Array of all matches:

String str = "She sells seashells by the Seashore";
String regex = "\\w*se\\w*";

Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(str);

List<String> matches = new ArrayList<>();
while (m.find()) {
    matches.add(m.group());
}

// Outputs: [sells, seashells, Seashore]
System.out.println(matches);

#Regex in MySQL

Functions
| Name | Description |
| ---- | :---------- |
| REGEXP | Whether string matches regex |
| REGEXP_INSTR() | Starting index of substring matching regex (NOTE: Only MySQL 8.0+) |
| REGEXP_LIKE() | Whether string matches regex (NOTE: Only MySQL 8.0+) |
| REGEXP_REPLACE() | Replace substrings matching regex (NOTE: Only MySQL 8.0+) |
| REGEXP_SUBSTR() | Return substring matching regex (NOTE: Only MySQL 8.0+) |

REGEXP
expr REGEXP pat

Examples

mysql> SELECT 'abc' REGEXP '^[a-d]';
1
mysql> SELECT name FROM cities WHERE name REGEXP '^A';
mysql> SELECT name FROM cities WHERE name NOT REGEXP '^A';
mysql> SELECT name FROM cities WHERE name REGEXP 'A|B|R';
mysql> SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A';
1   0
REGEXP_REPLACE
REGEXP_REPLACE(expr, pat, repl[, pos[, occurrence[, match_type]]])

Examples

mysql> SELECT REGEXP_REPLACE('a b c', 'b', 'X');
a X c
mysql> SELECT REGEXP_REPLACE('abc ghi', '[a-z]+', 'X', 1, 2);
abc X
REGEXP_SUBSTR
REGEXP_SUBSTR(expr, pat[, pos[, occurrence[, match_type]]])

Examples

mysql> SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+');
abc
mysql> SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3);
ghi
REGEXP_LIKE
REGEXP_LIKE(expr, pat[, match_type])

Examples

mysql> SELECT regexp_like('aba', 'b+');
1
mysql> SELECT regexp_like('aba', 'b{2}');
0
mysql> # i: case-insensitive
mysql> SELECT regexp_like('Abba', 'ABBA', 'i');
1
mysql> # m: multi-line
mysql> SELECT regexp_like('a\nb\nc', '^b$', 'm');
1
REGEXP_INSTR
REGEXP_INSTR(expr, pat[, pos[, occurrence[, return_option[, match_type]]]])

Examples

mysql> SELECT regexp_instr('aa aaa aaaa', 'a{3}');
4
mysql> SELECT regexp_instr('abba', 'b{2}', 2);
2
mysql> SELECT regexp_instr('abbabba', 'b{2}', 1, 2);
5
mysql> # return_option 1: position after the match
mysql> SELECT regexp_instr('abbabba', 'b{2}', 1, 2, 1);
7