Python Regular Expression (RegEx)

Python is one of today’s most popular programming languages, and it has many powerful features that help data scientists and analysts extract value from data. One of these features is regular expressions: special sequences of characters used to describe or search for patterns in a given string. They are mainly used for data cleaning and for pattern matching in text files.


What is Python Regular Expression (RegEx)?

A Python regular expression is a sequence of characters, some of which are metacharacters with special meanings, that defines a search pattern. We use these patterns in string-searching operations to "find" or "find and replace" text in strings.

The term "regular expressions" is frequently shortened to "RegEx".

In this guide, we will learn the basics of regular expressions in Python through a demonstration. We will begin by importing the “re” module.


Fig: Importing regular expression module (re)
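As a minimal sketch of this step: the “re” module ships with the Python standard library, so a plain import is enough. The loop below is only an illustrative check that the functions covered in this guide are available on the module.

```python
import re  # standard-library module; no installation is required

# Illustrative check: the functions covered in this guide live on the module.
for name in ("findall", "search", "split"):
    print(name, callable(getattr(re, name)))
```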

RegEx Functions

The “re” module provides a set of functions that enable us to search a string for a match. Some of these functions are listed below:

  • findall() function

The findall() function returns a list containing all matches.

Example:


Fig: findall() function
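A small sketch of findall() in use. The sample sentence and the pattern "ain" are assumptions chosen for illustration, not taken from the original figure.

```python
import re

# Hypothetical sample text.
text = "The rain in Spain stays mainly in the plain"

# findall() collects every non-overlapping match into a list.
matches = re.findall(r"ain", text)
print(matches)  # ['ain', 'ain', 'ain', 'ain']
```

If nothing matches, findall() returns an empty list rather than raising an error.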

  • search() function  

The search() function takes a regular expression pattern and a string, and it looks for the first location where the pattern matches within the string. If the search is successful, search() returns a match object. Otherwise, it returns None.

Example:


Fig: search() function
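The behaviour described above can be sketched as follows. The sample text and the two patterns are illustrative assumptions.

```python
import re

text = "The rain in Spain"  # hypothetical sample text

# A successful search returns a match object.
match = re.search(r"Spain", text)
if match:
    print(match.group())  # Spain
    print(match.span())   # (12, 17)

# An unsuccessful search returns None.
no_match = re.search(r"Portugal", text)
print(no_match)  # None
```

Because search() can return None, it is usually safest to test the result before calling methods such as group() on it.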

  • split() function

The split() function returns a list of the substrings obtained by splitting the string at each match.

Example:

Fig: split() function

For example, splitting on a whitespace pattern breaks the string wherever a space occurs.
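A short sketch of splitting on whitespace, using an assumed sample sentence:

```python
import re

text = "The rain in Spain"  # hypothetical sample text

# Split the string at every whitespace character.
parts = re.split(r"\s", text)
print(parts)  # ['The', 'rain', 'in', 'Spain']
```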


Python RegEx: Metacharacters

Every character in a Python RegEx is either a metacharacter or a regular character. A metacharacter has a special meaning, whereas a regular character matches itself.

Some of the basic metacharacters used in RegEx include:

  • “^” 

The ‘^’ character matches at the start of the string, so it checks whether the string starts with a particular word or character.

Example: 


Fig:  ^ character in Python RegEx
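One way to try the ‘^’ anchor, assuming a simple sample sentence:

```python
import re

text = "The rain in Spain"  # hypothetical sample text

# "^The" only matches if the string begins with "The".
starts_with_the = re.search(r"^The", text) is not None
starts_with_rain = re.search(r"^rain", text) is not None

print(starts_with_the)   # True
print(starts_with_rain)  # False
```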

  • “$”

The ‘$’ character matches at the end of the string, so it checks whether the string ends with a particular word or character.

Example:
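A minimal sketch of the ‘$’ anchor; the sample text is an assumption made for illustration.

```python
import re

text = "The rain in Spain"  # hypothetical sample text

# "Spain$" only matches if the string ends with "Spain".
ends_with_spain = re.search(r"Spain$", text) is not None
ends_with_rain = re.search(r"rain$", text) is not None

print(ends_with_spain)  # True
print(ends_with_rain)   # False
```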

  • “|”

The ‘|’ character expresses an either/or condition: it matches either the pattern on its left or the pattern on its right.

Example:

Fig: | character
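The either/or behaviour might look like this in practice. The sample sentence and the two alternatives are assumed for the example.

```python
import re

text = "The rain in Spain falls mainly on the plain"  # hypothetical sample

# Match either "falls" or "stays"; only "falls" occurs in this text.
matches = re.findall(r"falls|stays", text)
print(matches)  # ['falls']
```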

  • “+”

This matches one or more occurrences of the preceding character in a string.

Example:


Fig: + character
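A brief sketch of ‘+’, using a made-up string so the effect is easy to see:

```python
import re

text = "ac abc abbbc"  # hypothetical sample text

# "b+" requires at least one "b", so "ac" is not matched.
matches = re.findall(r"ab+c", text)
print(matches)  # ['abc', 'abbbc']
```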

  • “*”

This matches zero or more occurrences of the preceding character in a string.

Example:


Fig: * character
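For comparison with ‘+’, here is a sketch of ‘*’ on an assumed sample string:

```python
import re

text = "ac abc abbbc"  # hypothetical sample text

# "b*" allows zero "b" characters, so "ac" is matched as well.
matches = re.findall(r"ab*c", text)
print(matches)  # ['ac', 'abc', 'abbbc']
```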

| Characters | Description |
| --- | --- |
| { } | Matches exactly the specified number of occurrences |
| * | Matches zero or more occurrences |
| + | Matches one or more occurrences |
| [ ] | Matches a set of characters |
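The ‘{ }’ and ‘[ ]’ metacharacters can be sketched together. The sample strings and patterns here are illustrative assumptions.

```python
import re

text = "Call 555-1234 or 55-12 today"  # hypothetical sample text

# [0-9] is a set of digit characters; {4} requires exactly four in a row.
four_digits = re.findall(r"[0-9]{4}", text)
print(four_digits)  # ['1234']

# A set matches any single character listed inside the brackets.
vowels = re.findall(r"[aeiou]", "today")
print(vowels)  # ['o', 'a']
```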


Python RegEx - Special sequences

A special sequence is a backslash (‘\’) followed by a specific character that gives the sequence its special meaning. Some special sequences include:

  • “\A”

The \A sequence matches only at the start of the string, so it checks whether the string starts with the given characters.

Example:


Fig: \A sequence in Python RegEx
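A minimal sketch of \A, assuming a simple sample sentence:

```python
import re

text = "The rain in Spain"  # hypothetical sample text

# \A anchors the pattern to the very start of the string.
at_start = re.findall(r"\AThe", text)
not_at_start = re.findall(r"\Arain", text)

print(at_start)      # ['The']
print(not_at_start)  # []
```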

  • “\s”

The \s sequence returns a match wherever the string contains a whitespace character.

Example:


Fig: \s sequence
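As a rough illustration of \s on an assumed sample string:

```python
import re

text = "The rain in Spain"  # hypothetical sample text

# Each space in the text produces one match.
spaces = re.findall(r"\s", text)
print(spaces)       # [' ', ' ', ' ']
print(len(spaces))  # 3
```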

  • “\d”

The \d sequence checks if there are any digits in the given string.

Example:


Fig: \d sequence
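The \d sequence might be tried like this; the sample text is invented for the example.

```python
import re

text = "Room 42, floor 7"  # hypothetical sample text

# \d matches a single digit, so each digit is returned separately.
digits = re.findall(r"\d", text)
print(digits)  # ['4', '2', '7']

# A string without digits yields an empty list.
print(re.findall(r"\d", "no digits here"))  # []
```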

  • “\Z”

The \Z sequence matches only at the end of the string, so it checks whether the string ends with a particular word.

Example:

Fig: \Z sequence
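A short sketch of \Z under the assumption of a simple sample sentence:

```python
import re

text = "The rain in Spain"  # hypothetical sample text

# \Z anchors the pattern to the very end of the string.
at_end = re.findall(r"Spain\Z", text)
not_at_end = re.findall(r"rain\Z", text)

print(at_end)      # ['Spain']
print(not_at_end)  # []
```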

  • “\w”

The \w sequence returns a match at every word character (letters, digits, and the underscore).

Example:


Fig: \w sequence in Python RegEx
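To see which characters count as word characters, consider this sketch with an assumed sample string:

```python
import re

text = "Hi, Bob_1!"  # hypothetical sample text

# Punctuation and spaces are skipped; letters, digits, and "_" are matched.
word_chars = re.findall(r"\w", text)
print(word_chars)  # ['H', 'i', 'B', 'o', 'b', '_', '1']
```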

Python RegEx - Sets

| Characters | Description |
| --- | --- |
| [agh] | Returns a match when any of the listed characters (a, g, or h) is present |
| [a-h] | Returns a match for any lowercase character between a and h |
| [^agh] | Returns a match for every character except a, g, and h |

  • “[agh]”

Example:


Fig: Sets example
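A possible illustration of the [agh] set, using an assumed sample sentence:

```python
import re

text = "The rain in Spain"  # hypothetical sample text

# Match any single character that is "a", "g", or "h".
matches = re.findall(r"[agh]", text)
print(matches)  # ['h', 'a', 'a']
```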

  • “[a-h]”

Example:

Fig: Sets example in Python RegEx
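The range form can be sketched the same way. Note that the range is case-sensitive, so the uppercase letters in this assumed sample are not matched.

```python
import re

text = "The rain in Spain"  # hypothetical sample text

# Match any single lowercase character from "a" through "h".
matches = re.findall(r"[a-h]", text)
print(matches)  # ['h', 'e', 'a', 'a']
```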

  • “[^agh]”

Example:


Fig: Sets example
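A sketch of the negated set, where a ‘^’ placed directly after the opening bracket inverts the match. The sample string is an assumption.

```python
import re

text = "high bag"  # hypothetical sample text

# Match every character that is NOT "a", "g", or "h" (spaces included).
matches = re.findall(r"[^agh]", text)
print(matches)  # ['i', ' ', 'b']
```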


Conclusion

In this article, we discussed Python regular expressions. We looked at various functions and metacharacters in Python RegEx through demonstrations. 

If you have any questions, please ask them in the comments section, and we'll have our experts answer them for you promptly.

About the Author

Aryan Gupta

Aryan is a tech enthusiast who likes to stay updated about trending technologies of today. He is passionate about all things technology, a keen researcher, and writes to inspire. Aside from technology, he is an active football player and a keen enthusiast of the game.
