Regex In Java

Regex is short for regular expression. A regular expression is a sequence of characters that defines a search pattern, which is used to search, manipulate, and edit text. The pattern describes the data you want to find in a text, and it can be as simple as a single character or considerably more complex. Most kinds of text search and replace can be performed with a regular expression. In Java, regular expression support is not part of the language syntax; it is provided by the java.util.regex package of the standard library, which needs to be imported before use.

Three Classes of Regex

The java.util.regex package provides the following classes, which are used for matching character sequences against patterns specified by regular expressions:

(i) Pattern class

(ii) Matcher class

(iii) PatternSyntaxException class

Pattern Class:

The Pattern class represents a compiled regular expression. It has no public constructors; to create a Pattern object, you invoke one of its public static compile() methods, which take the regular expression as their first argument.
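As a minimal sketch of the compile() step, the following compiles a pattern once and reuses it. The class and constant names are hypothetical and chosen only for illustration; the second constant uses the two-argument compile() overload with the CASE_INSENSITIVE flag.

```java
import java.util.regex.Pattern;

class PatternCompileSketch {
    // Compile once and reuse: any single character followed by a lowercase 's'.
    static final Pattern TWO_CHARS_ENDING_IN_S = Pattern.compile(".s");

    // The two-argument overload accepts flags such as CASE_INSENSITIVE.
    static final Pattern ANY_CASE = Pattern.compile(".s", Pattern.CASE_INSENSITIVE);

    public static void main(String[] args) {
        System.out.println(TWO_CHARS_ENDING_IN_S.matcher("as").matches()); // true
        System.out.println(TWO_CHARS_ENDING_IN_S.matcher("aS").matches()); // false
        System.out.println(ANY_CASE.matcher("aS").matches());              // true
        System.out.println(TWO_CHARS_ENDING_IN_S.pattern());               // .s
    }
}
```

Compiling once and keeping the Pattern in a constant avoids recompiling the same expression on every call.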

Matcher Class:

The Matcher class is the engine that interprets the pattern and performs match operations against an input string. Like the Pattern class, it has no public constructors. You obtain a Matcher object by invoking the matcher() method on a Pattern object.

PatternSyntaxException:

PatternSyntaxException is an unchecked exception. If it is thrown, it means that there is a syntax error in a regular expression pattern.
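The following sketch shows one way such an error might be caught and reported. The helper name describeSyntaxError is hypothetical; getDescription() and getIndex() are methods of PatternSyntaxException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxCheckSketch {
    // Hypothetical helper: returns null if the regex compiles,
    // otherwise a short description of the syntax error.
    static String describeSyntaxError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " at index " + e.getIndex();
        }
    }

    public static void main(String[] args) {
        // "[a-z" is missing its closing bracket, so compile() throws.
        System.out.println(describeSyntaxError("[a-z"));
        // "[a-z]" is well formed, so the helper returns null.
        System.out.println(describeSyntaxError("[a-z]"));
    }
}
```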

Example 1:

Code 1:

import java.util.regex.*; 

public class CodeExample {

public static void main(String[] arg) {

// first way   

// Pattern is the built-in class that represents a compiled regex

Pattern pattern = Pattern.compile(".s"); // dot (.) matches any single character

// Matcher is the built-in class that matches the given input against the pattern

Matcher matcher = pattern.matcher("as");  

boolean result_first = matcher.matches();  

//2nd way  

boolean result_second=Pattern.compile(".s").matcher("as").matches();  // single line search pattern  

//3rd way  

boolean result_third = Pattern.matches(".s", "as"); // shorthand search pattern

System.out.println(result_first); // true: two characters, and the second is 's'

System.out.println(result_second); // true: same check written as a single chained call

System.out.println(result_third); // true: same check using the static shorthand

}

}

Output:

true

true

true

Methods:

The main methods of the Pattern and Matcher classes are listed below:

(i) Pattern Class Methods:

(a) static Pattern compile(String regex): It compiles the given regex and returns a Pattern instance.

(b) Matcher matcher(CharSequence input): It creates a matcher that matches the given input against the pattern.

(c) static boolean matches(String regex, CharSequence input): It works as a combination of the compile and matcher methods. It compiles the regular expression and checks whether the entire input matches it.

(d) String[] split(CharSequence input): It splits the given input around matches of the pattern.

(e) String pattern(): It returns the regular expression from which the pattern was compiled.
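As a brief illustration of split(), the sketch below splits a comma-separated string while ignoring whitespace around the commas. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitSketch {
    // A comma with optional whitespace on either side.
    static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Hypothetical helper: splits the input around each comma.
    static String[] splitOnCommas(String input) {
        return COMMA.split(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnCommas("red , green,blue"))); // [red, green, blue]
    }
}
```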

(ii) Matcher Class Methods:

(a). boolean matches(): It is used to find out whether the entire input matches the pattern.

(b). boolean find(): It is used to find the next subsequence of the input that matches the pattern.

(c). boolean find(int start): It resets the matcher and finds the next subsequence that matches the pattern, starting at the given index.

(d). String group(): It is used to return the matched subsequence.

(e). int start(): It is used to return the starting index of the matched subsequence.

(f). int end(): It is used to return the index just past the last character of the matched subsequence.

(g). int groupCount(): It is used to return the number of capturing groups in the matcher's pattern.
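The Matcher methods above are typically combined in a find() loop. The sketch below collects every match together with its start and end index; the class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindSketch {
    // Hypothetical helper: lists each match as "text@start-end".
    static List<String> describeMatches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            // end() is the index just past the last matched character.
            found.add(matcher.group() + "@" + matcher.start() + "-" + matcher.end());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(describeMatches("\\d+", "a12b345")); // [12@1-3, 345@4-7]

        // groupCount() counts the capturing groups in the pattern, here two.
        Matcher range = Pattern.compile("(\\d+)-(\\d+)").matcher("10-20");
        System.out.println(range.groupCount()); // 2
    }
}
```

Unlike matches(), find() does not require the whole input to match, so it is the usual choice for locating patterns inside longer text.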

Example 2: 

Code 

import java.util.regex.*; 

public class CodeExample {

public static void main(String[] arg) {

// each dot (.) matches exactly one character, and matches() requires the whole word to match the pattern

boolean result_first = Pattern.matches(".s", "as");

boolean result_second =Pattern.matches(".s", "mk");

boolean result_third = Pattern.matches(".s", "mst");

boolean result_fourth =Pattern.matches(".s", "amms");

boolean result_fifth =Pattern.matches("..s", "mas");

boolean result_six =Pattern.matches("...s", "amms");

System.out.println(result_first);//true (2nd char is s)  

System.out.println(result_second);//false (2nd char is not s)  

System.out.println(result_third);//false (has more than 2 char)  

System.out.println(result_fourth);//false (has more than 2 char)  

System.out.println(result_fifth);//true (3rd char is s)  

System.out.println(result_six);//true (4th char is s)  

}

}

Output

true

false

false

false

true

true

Regex in Java Character Class

The square brackets “[]” are used to define a character class in Java's regular expressions. A character class matches a single character from the specified set of possible characters.

 Negation:

Placing “^” immediately after the opening bracket, as in “[^...]”, defines the negation variant of the character class. It matches any single character that is not specified in the character class.

(eg) (i). The Regex [^AB] matches any character other than A and B.

(ii). The Regex [^\d] matches any character except a digit.

Range:

The range variant of the character class allows the usage of a range of characters.

(eg) (i). The Regex [a-z] matches a single lowercase letter from a to z.

(ii). The Regex [^A-Z] matches a character that is not a capital letter.

Union:

The union variant of the character class matches a character from any one of the specified ranges.

(eg) (i). The Regex [a-z[0-9]] matches a character that is either a lowercase letter (a-z) or a digit (0-9).

Intersection:

The intersection variant of the character class matches only the characters that are common to both of the specified ranges.

(eg) The Regex [a-z&&[r-u]] matches a single character from r to u, where && is used to define the intersection relation between the ranges.

Subtraction:

One range can be subtracted from another by combining two variants of the character class: intersection and negation.

(eg). The Regex [a-l&&[^e-h]] matches the characters a to l with the range e to h subtracted, that is, a to d and i to l.
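The character class variants above can be checked with a short sketch like the following. The class and helper names are hypothetical; each call tests a single character against one class.

```java
import java.util.regex.Pattern;

class CharClassSketch {
    // Hypothetical helper: true if the whole input matches the character class.
    static boolean matchesClass(String charClass, String input) {
        return Pattern.matches(charClass, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesClass("[^AB]", "C"));         // true  (negation)
        System.out.println(matchesClass("[^AB]", "A"));         // false
        System.out.println(matchesClass("[a-z]", "q"));         // true  (range)
        System.out.println(matchesClass("[a-z[0-9]]", "7"));    // true  (union)
        System.out.println(matchesClass("[a-z&&[r-u]]", "s"));  // true  (intersection)
        System.out.println(matchesClass("[a-z&&[r-u]]", "b"));  // false
        System.out.println(matchesClass("[a-l&&[^e-h]]", "c")); // true  (subtraction)
        System.out.println(matchesClass("[a-l&&[^e-h]]", "f")); // false
    }
}
```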

Regex in Java Quantifiers:

Quantifiers specify how many times a character, character class, or group must occur for a match. A quantifier is written immediately after the item it applies to.

(a). Question mark ?: It is used to make the preceding item optional, so the item occurs zero times or once.

(b). Plus +: It is used to repeat the previous item one or more times. It is greedy: the engine first matches as many repetitions as possible and then tries fewer repetitions of the preceding item, down to the point where the preceding item is matched only once.

(c). Star *: It is used to repeat the previous item zero or more times. It is also greedy: the engine first matches as many repetitions as possible and then tries fewer repetitions of the preceding item, down to the point where the preceding item is not matched at all.

(d). {n}: It is used to repeat the previous item exactly n times.

(e). {n,}: It is used to repeat the previous item at least n times.

(f). {n,m}: It is used to repeat the previous item between n and m times.
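The counted quantifiers and the greedy behavior described above can be sketched as follows. The class and helper names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierSketch {
    // Hypothetical helper: returns the first match found in the input, or null.
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("\\d{3}", "123"));     // true  (exactly 3 digits)
        System.out.println(Pattern.matches("\\d{3}", "12"));      // false (only 2 digits)
        System.out.println(Pattern.matches("\\d{2,}", "12345"));  // true  (at least 2 digits)
        System.out.println(Pattern.matches("\\d{2,4}", "12345")); // false (more than 4 digits)

        // Greedy matching: a+ takes as many 'a' characters as it can.
        System.out.println(firstMatch("a+", "caaat"));   // aaa
        System.out.println(firstMatch("a{2}", "caaat")); // aa
    }
}
```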

Example 3: Quantifier

Code 

import java.util.regex.*; 

public class CodeExample {

public static void main(String[] arg) {

System.out.println("? quantifier ....");// ? quantifier: a character from the class occurs zero or one time

System.out.println(Pattern.matches("[amen]?", "a"));//true (a or m or e or n occurs once)  

System.out.println(Pattern.matches("[amen]?", "aaa"));//false (a occurs more than once)  

System.out.println(Pattern.matches("[amen]?", "aammmnn"));//false (a, m, and n occur more than once)  

System.out.println(Pattern.matches("[amen]?", "aazzta"));//false (more than one character, and z and t are not in the class)  

System.out.println(Pattern.matches("[amen]?", "am"));//false (two characters; at most one is allowed)  

System.out.println("+ quantifier ....");  // + quantifier: a character from the class occurs one or more times

// always false if the word contains a character outside the class

System.out.println(Pattern.matches("[amen]+", "a"));//true (a or m or e or n once or more times)  

System.out.println(Pattern.matches("[amen]+", "aaa"));//true (a comes more than one time)  

System.out.println(Pattern.matches("[amen]+", "aammmnn"));//true (a or m or e or n comes more than once)  

System.out.println(Pattern.matches("[amen]+", "aazzta"));//false (z and t are not matching pattern)  

System.out.println("* quantifier ....");  // * quantifier: a character from the class occurs zero or more times

System.out.println(Pattern.matches("[amen]*", "ammmna"));//true (only a, m, and n occur, any number of times) 

System.out.println(Pattern.matches("[amen]*", "brrrccdd"));//false (b, r, c, and d are not in the class) 

}

}

Output

? quantifier ....

true

false

false

false

false

+ quantifier ....

true

true

true

false

* quantifier ....

true

false

Regex in Java Metacharacters:

Metacharacters are characters that have a special meaning in a pattern. They are used to describe search criteria and text manipulation, and they act as shorthand for common character classes and positions.

The Regex . matches any character (by default, any character except a line terminator)

The Regex \d matches any digit, short for [0-9]

The Regex \D matches any non-digit, short for [^0-9]

The Regex \s matches any whitespace character, short for [ \t\n\x0B\f\r]

The Regex \S matches any non-whitespace character, short for [^\s]

The Regex \w matches any word character short for [a-zA-Z_0-9]

The Regex \W matches any non-word character short for [^\w]

The Regex \b matches a word boundary

The Regex \B matches a non-word boundary.
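As a small sketch of the \w, \s, and \b metacharacters listed above, the helpers below check for a whole word and count whitespace-separated words. The class and method names are hypothetical; Pattern.quote() is used so that the searched word is treated as literal text.

```java
import java.util.regex.Pattern;

class MetacharSketch {
    // Hypothetical helper: true if the text contains the word as a whole word.
    static boolean containsWord(String word, String text) {
        // \b anchors the match to word boundaries on both sides.
        return Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text).find();
    }

    // Hypothetical helper: counts the tokens separated by runs of whitespace.
    static int countTokens(String text) {
        return Pattern.compile("\\s+").split(text).length;
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("\\w+", "user_01")); // true  (letters, digits, underscore)
        System.out.println(Pattern.matches("\\w+", "user-01")); // false ('-' is not a word character)
        System.out.println(containsWord("cat", "a cat sat"));   // true
        System.out.println(containsWord("cat", "concatenate")); // false (no word boundary around "cat")
        System.out.println(countTokens("one  two\tthree"));     // 3
    }
}
```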

Example 4: Metacharacters

Code

import java.util.regex.*; 

public class CodeExample {

public static void main(String[] arg) {

System.out.println("metacharacters d...."); // two backslashes with lower case d (\\d) mean a digit  

// checks that the word consists of exactly one digit

System.out.println(Pattern.matches("\\d", "abc"));//false (non-digit)  

System.out.println(Pattern.matches("\\d", "1"));//true (digit and comes once)  

System.out.println(Pattern.matches("\\d", "4443"));//false (digit but comes more than once)  

System.out.println(Pattern.matches("\\d", "323abc"));//false (digit and char)  

System.out.println("metacharacters D...."); // two backslashes with upper case D (\\D) mean a non-digit  

// checks that the word consists of exactly one non-digit character

System.out.println(Pattern.matches("\\D", "abc"));//false (non-digit but comes more than once)  

System.out.println(Pattern.matches("\\D", "1"));//false (digit)  

System.out.println(Pattern.matches("\\D", "4443"));//false (digit)  

System.out.println(Pattern.matches("\\D", "323abc"));//false (digit and char)  

System.out.println(Pattern.matches("\\D", "m"));//true (non-digit and comes once)  

System.out.println("metacharacters D with quantifier....");  // \\D followed by the * quantifier (\\D*) 

// checks that the word consists of zero or more non-digit characters

System.out.println(Pattern.matches("\\D*", "mark"));//true (non-digit and may come 0 or more times)  

}

}

Output

metacharacters d....

false

true

false

false

metacharacters D....

false

false

false

false

true

metacharacters D with quantifier....

true

Conclusion:

Regex (short for Regular Expressions) support in Java is an API for defining String patterns that can be used to scan, manipulate, and modify strings. It is commonly used to specify constraints on strings, such as password and email validation. The API lives in the java.util.regex package, which is made up of three classes and one interface (MatchResult).

You can use the Java Regex Tester Tool to evaluate your regular expressions after working through this Java regex tutorial.
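To illustrate the kind of string constraint mentioned above, here is a deliberately simplified sketch. Both rules are assumptions made for this example (a password of 8 to 16 word characters, and a loose email shape); they are not a complete or standards-compliant validator, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class ValidationSketch {
    // Assumed rule for illustration: 8 to 16 word characters.
    static final Pattern PASSWORD = Pattern.compile("\\w{8,16}");

    // Simplified email shape: name, '@', domain, dot, and a suffix of 2+ letters.
    static final Pattern EMAIL = Pattern.compile("[\\w.]+@[\\w.]+\\.[a-zA-Z]{2,}");

    static boolean isValidPassword(String candidate) {
        return PASSWORD.matcher(candidate).matches();
    }

    static boolean isValidEmail(String candidate) {
        return EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidPassword("secret12"));       // true
        System.out.println(isValidPassword("short"));          // false (fewer than 8 characters)
        System.out.println(isValidEmail("user@example.com"));  // true
        System.out.println(isValidEmail("user@example"));      // false (no dot and suffix)
    }
}
```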

About the Author

Simplilearn

Simplilearn is one of the world’s leading providers of online training for Digital Marketing, Cloud Computing, Project Management, Data Science, IT, Software Development, and many other emerging technologies.
