Python regex

what is regex?¶

Regular expressions (regex) are a tool for matching patterns in text.
Python has a built-in module called re for working with regular expressions.

find a pattern using python regex¶

Let's look for a ZIP code in the given text.
ZIP code characteristics:
- It is exactly 6 characters long, and every character is a digit.

import re

address1 = "43 Diamond Harbour Road, Alipore, Kolkata, 700027"
address2 = "781, Golden towers, Zip:500001, Hyderabad"

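# The pattern has exactly one capturing group, so findall() returns that group's text.
# The greedy leading .* means the last run of six digits in the string is captured.
zip_code_pattern = r'.*(?P<zip_code>[0-9]{6}).*'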
output = re.findall(zip_code_pattern, address1)
print(output)
# output: ['700027']
output = re.findall(zip_code_pattern, address2)
print(output)
# output: ['500001']
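
The pattern above defines a named group, zip_code, but never reads it by name. As a rough sketch of an alternative, the helper below (extract_zip_code is a hypothetical name, not part of the re module) uses re.search with word boundaries, so that a longer run of digits such as a phone number is not partially matched:

```python
import re

# \b on both sides keeps the match from landing inside a longer run of digits.
ZIP_CODE_PATTERN = re.compile(r'\b(?P<zip_code>[0-9]{6})\b')


def extract_zip_code(address):
    """Return the first 6-digit ZIP code in address, or None if there is none."""
    match = ZIP_CODE_PATTERN.search(address)
    return match.group('zip_code') if match else None


print(extract_zip_code("43 Diamond Harbour Road, Alipore, Kolkata, 700027"))
# output: 700027
print(extract_zip_code("781, Golden towers, Zip:500001, Hyderabad"))
# output: 500001
print(extract_zip_code("Phone: 9876543210"))
# output: None
```

Reading the group by name keeps the calling code readable even if more groups are added to the pattern later.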

character sets¶

Pattern	Meaning
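\w	Match a single word character: a-z, A-Z, 0-9, and underscore (_). For str patterns, other Unicode letters and digits also match unless re.ASCII is set
\d	Match a single digit 0-9 (for str patterns, other Unicode decimal digits also match unless re.ASCII is set)
\s	Match a single whitespace character, including space, \t, \n, and \r
.	Match any single character except a newline
\W	Match a single character that is not a word character
\D	Match a single character that is not a digit
\S	Match a single character that is not a whitespace character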
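
A short sketch of the character sets in the table above, using re.findall on made-up sample strings (the variable names are only for illustration):

```python
import re

text = "Order_42 shipped\ton 2024-01-15"

words = re.findall(r'\w+', text)          # runs of word characters
digits = re.findall(r'\d', "a1b22")       # single digits
spaces = re.findall(r'\s', text)          # single whitespace characters
non_digits = re.findall(r'\D+', "a1b22")  # runs of non-digit characters
chunks = re.findall(r'\S+', text)         # runs of non-whitespace characters
dots = re.findall(r'.', "a\nb")           # the newline is skipped

print(words)       # output: ['Order_42', 'shipped', 'on', '2024', '01', '15']
print(digits)      # output: ['1', '2', '2']
print(spaces)      # output: [' ', '\t', ' ']
print(non_digits)  # output: ['a', 'b']
print(chunks)      # output: ['Order_42', 'shipped', 'on', '2024-01-15']
print(dots)        # output: ['a', 'b']
```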

anchors¶

Pattern	Meaning
^	Match at the beginning of a string
$	Match at the end of a string
\b	Match a position defined as a word boundary
\B	Match a position that is not a word boundary
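
Anchors match positions rather than characters. The sketch below reports where each anchored pattern matches in a sample string; starts is a hypothetical helper written for this example:

```python
import re

text = "cat concat cats"


def starts(pattern, string):
    """Return the start index of every match of pattern in string."""
    return [m.start() for m in re.finditer(pattern, string)]


print(starts(r'^cat', text))     # output: [0]   only at the beginning of the string
print(starts(r'cats$', text))    # output: [11]  only at the end of the string
print(starts(r'\bcat\b', text))  # output: [0]   "cat" as a whole word
print(starts(r'\Bcat', text))    # output: [7]   "cat" inside "concat"
print(starts(r'cat\B', text))    # output: [11]  "cat" followed by more word characters
```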

quantifiers¶

Quantifiers (Greedy)	Non-greedy Quantifiers (Lazy)	Meaning
*	*?	Match its preceding element zero or more times.
+	+?	Match its preceding element one or more times.
?	??	Match its preceding element zero or one time.
{n}	{n}?	Match its preceding element exactly n times.
{n,}	{n,}?	Match its preceding element at least n times.
{n,m}	{n,m}?	Match its preceding element from n to m times.
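
The difference between greedy and lazy quantifiers is easiest to see side by side. A minimal sketch with made-up input strings:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

greedy_tags = re.findall(r'<.+>', html)   # .+ grabs as much as it can
lazy_tags = re.findall(r'<.+?>', html)    # .+? stops at the first possible '>'

numbers = "1 22 333 4444"
greedy_runs = re.findall(r'\d{2,3}', numbers)   # prefers 3 digits
lazy_runs = re.findall(r'\d{2,3}?', numbers)    # prefers 2 digits

print(greedy_tags)  # output: ['<b>bold</b> and <i>italic</i>']
print(lazy_tags)    # output: ['<b>', '</b>', '<i>', '</i>']
print(greedy_runs)  # output: ['22', '333', '444']
print(lazy_runs)    # output: ['22', '33', '44', '44']
```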

sets & ranges¶

Pattern	Meaning
[XYZ]	Match any of three elements X, Y, and Z
[X-Y]	Match a range from X to Y
[^XYZ]	Match any single element except X, Y, and Z
[^X-Y]	Match any single element outside the range from X to Y
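
A brief sketch of sets, ranges, and their negated forms. Note that the ^ goes inside the brackets to negate a set; the sample text is made up:

```python
import re

text = "gray grey grby gr3y"

either = re.findall(r'gr[ae]y', text)         # 'a' or 'e' in the third position
in_range = re.findall(r'gr[a-e]y', text)      # any letter from 'a' to 'e'
not_listed = re.findall(r'gr[^ae]y', text)    # anything except 'a' or 'e'
not_in_range = re.findall(r'gr[^a-z]y', text) # anything except a lowercase letter

print(either)        # output: ['gray', 'grey']
print(in_range)      # output: ['gray', 'grey', 'grby']
print(not_listed)    # output: ['grby', 'gr3y']
print(not_in_range)  # output: ['gr3y']
```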

capturing groups¶

Pattern	Meaning
(X)	Capture X in a numbered group
(?P<name>X)	Capture X in a group that can also be referenced by name
\N	Reference the text matched by capturing group #N (backreference)
\g<N>	Reference capturing group #N in a replacement string, for example in sub()
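
The sketch below shows numbered and named groups, a \1 backreference inside a pattern, and \g<N> inside a replacement string. The date and sentence are made-up sample data:

```python
import re

date = "2024-01-15"
match = re.search(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', date)

year = match.group(1)          # by number
month = match.group('month')   # by name
parts = match.groupdict()      # all named groups as a dict

# \1 must match the same text that group 1 captured: a repeated word.
repeated = re.findall(r'\b(\w+) \1\b', "this is is a test test")

# \g<N> refers to group N in the replacement string of sub().
swapped = re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\g<3>/\g<2>/\g<1>', date)

print(year)      # output: 2024
print(month)     # output: 01
print(parts)     # output: {'year': '2024', 'month': '01', 'day': '15'}
print(repeated)  # output: ['is', 'test']
print(swapped)   # output: 15/01/2024
```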

alternation¶

Pattern	Meaning
X|Y	Match either X or Y
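
Alternation applies to everything on each side of the |, so a group is usually needed to limit its scope. A small sketch; the non-capturing group (?:...) used here is not listed in the tables above but is standard re syntax:

```python
import re

pets = re.findall(r'cat|dog', "cat, dog, bird")

# Without a group, the alternatives are "gra" and "ey".
ungrouped = re.findall(r'gra|ey', "gray grey")

# With a group, only the middle letter alternates.
grouped = re.findall(r'gr(?:a|e)y', "gray grey")

print(pets)       # output: ['cat', 'dog']
print(ungrouped)  # output: ['gra', 'ey']
print(grouped)    # output: ['gray', 'grey']
```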

look around¶

Pattern	Meaning
X(?=Y)	Match X but only if it is followed by Y
X(?!Y)	Match X but only if it is NOT followed by Y
(?<=Y)X	Match X if there is Y before it
(?<!Y)X	Match X if there is NO Y before it
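
Lookarounds check the surrounding text without including it in the match. A minimal sketch on a made-up string; the \b anchors are added so that a negative lookaround cannot succeed by matching only part of a number:

```python
import re

text = "price: $100, discount: 20%, qty: 15"

before_percent = re.findall(r'\d+(?=%)', text)          # followed by %
not_before_percent = re.findall(r'\b\d+\b(?!%)', text)  # NOT followed by %
after_dollar = re.findall(r'(?<=\$)\d+', text)          # preceded by $
not_after_dollar = re.findall(r'(?<!\$)\b\d+', text)    # NOT preceded by $

print(before_percent)      # output: ['20']
print(not_before_percent)  # output: ['100', '15']
print(after_dollar)        # output: ['100']
print(not_after_dollar)    # output: ['20', '15']
```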

regex functions¶

Function	Description
findall()	Return a list of all non-overlapping matches (an empty list if there are none)
finditer()	Return an iterator yielding a Match object for each non-overlapping match
search()	Return a Match object for the first match anywhere in the string, or None
fullmatch()	Return a Match object if the whole string matches the pattern, or None
match()	Return a Match object if the pattern matches at the beginning of the string, or None
sub()	Return a string with each match replaced by a replacement
split()	Split a string at the occurrences of matches and return a list
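
The functions in the table differ mainly in where they look and what they return. A short sketch on a made-up string:

```python
import re

text = "apple=1, banana=22, cherry=333"
number = r'\d+'

all_numbers = re.findall(number, text)
no_numbers = re.findall(number, "no digits")   # empty list, not None
positions = [(m.group(), m.start()) for m in re.finditer(number, text)]
first = re.search(number, text).group()        # first match anywhere
at_start = re.match(number, text)              # None: text starts with a letter
whole_ok = re.fullmatch(number, "12345")       # Match: the entire string is digits
whole_bad = re.fullmatch(number, "123abc")     # None: trailing letters
masked = re.sub(number, '#', text)
fields = re.split(r',\s*', text)

print(all_numbers)  # output: ['1', '22', '333']
print(no_numbers)   # output: []
print(positions)    # output: [('1', 6), ('22', 16), ('333', 27)]
print(first)        # output: 1
print(at_start)     # output: None
print(masked)       # output: apple=#, banana=#, cherry=#
print(fields)       # output: ['apple=1', 'banana=22', 'cherry=333']
```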

regex flags¶

Flag	Alias	Inline Flag	Meaning
re.ASCII	re.A	(?a)	The re.ASCII flag is relevant to str (Unicode) patterns only and is ignored for bytes patterns. It makes \w, \W, \b, \B, \d, \D, \s, and \S perform ASCII-only matching instead of full Unicode matching.
re.DEBUG	N/A	N/A	The re.DEBUG flag displays debug information about the compiled pattern.
re.IGNORECASE	re.I	(?i)	The re.IGNORECASE flag performs case-insensitive matching, so [A-Z] also matches lowercase letters.
re.LOCALE	re.L	(?L)	The re.LOCALE flag is relevant to bytes patterns only. It makes \w, \W, \b, \B, and case-insensitive matching dependent on the current locale. It is not compatible with the re.ASCII flag.
re.MULTILINE	re.M	(?m)	The re.MULTILINE flag makes ^ match at the beginning of the string and at the beginning of each line, and $ match at the end of the string and at the end of each line.
re.DOTALL	re.S	(?s)	By default, the dot (.) matches any character except a newline. The re.DOTALL flag makes the dot match every character, including a newline.
re.VERBOSE	re.X	(?x)	The re.VERBOSE flag lets you split a pattern into visually separate logical sections and add comments.
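
A sketch of the most commonly used flags in action, on made-up sample strings. Flags are passed as the last argument, or written inline at the start of the pattern:

```python
import re

names = "Python PYTHON python"
by_flag = re.findall(r'python', names, re.IGNORECASE)
by_inline = re.findall(r'(?i)python', names)    # same effect, inline form

lines = "first line\nsecond line"
first_words = re.findall(r'^\w+', lines)                     # ^ only at string start
line_starts = re.findall(r'^\w+', lines, re.MULTILINE)       # ^ at every line start
no_dotall = re.search(r'first.*second', lines)               # . stops at the newline
with_dotall = re.search(r'first.*second', lines, re.DOTALL)  # . crosses the newline

# "\u0664\u0662" is the number 42 written in Arabic-Indic digits.
mixed = "42 \u0664\u0662"
unicode_digits = re.findall(r'\d+', mixed)
ascii_digits = re.findall(r'\d+', mixed, re.ASCII)

# re.VERBOSE ignores whitespace in the pattern and allows # comments.
zip_pattern = re.compile(r'''
    \b          # start at a word boundary
    [0-9]{6}    # exactly six digits
    \b          # end at a word boundary
''', re.VERBOSE)
zip_codes = zip_pattern.findall("Kolkata, 700027")

print(by_flag)       # output: ['Python', 'PYTHON', 'python']
print(first_words)   # output: ['first']
print(line_starts)   # output: ['first', 'second']
print(no_dotall)     # output: None
print(ascii_digits)  # output: ['42']
print(zip_codes)     # output: ['700027']
```

Several flags can be combined with the | operator, for example re.IGNORECASE | re.MULTILINE.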
