Pattern Matching In Java

What is pattern matching?

Pattern matching means searching a given text for a specific pattern.

Suppose you want to find all occurrences of the word "Trade" in the text below. You can use pattern matching to do that.

Example text: "Trade involves the transfer of goods. Possible synonyms of "trade" include "commerce" and "financial transaction". Types of trade include barter. A network that allows trade is called a market."

In Java you can do pattern matching with the help of regular expressions. In this tutorial I am going to explain how to perform pattern matching with Java's core regular expression classes and their methods.

Let's take one more example. If you have a text that contains some digits and some letters (898498398TraderA) and you want to extract only the digits (898498398), you can use pattern matching with the help of regular expressions.

What Is a Regular Expression?

A regular expression is just a string, but some of its characters take on a special meaning when it is used for pattern matching. For example, the regular expression "\d+" represents a number, i.e. one or more digits.

A regular expression can also be a plain string without any special characters. In the first example, "Trade" is a simple regular expression.
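As a rough sketch of the digit example above, the snippet below pulls the leading number out of "898498398TraderA" using "\d+". It relies on the Pattern and Matcher classes from java.util.regex, which are covered in the following sections. The class name DigitExtractor and the method firstDigits are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for this illustration.
class DigitExtractor {

    // Returns the first run of digits in the input, or an empty string if there is none.
    static String firstDigits(String input) {
        // In Java source code the backslash must be escaped, so the regex \d+ is written "\\d+".
        Matcher matcher = Pattern.compile("\\d+").matcher(input);
        return matcher.find() ? matcher.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(firstDigits("898498398TraderA")); // prints 898498398
    }
}
```

Note the doubled backslash: the regular expression itself is \d+, but inside a Java string literal it has to be written as "\\d+".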
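To make the "Trade" example concrete, here is a minimal sketch that counts how often the plain regular expression "Trade" occurs in the example text from the beginning of this tutorial. A plain string is matched case-sensitively by default, so the sketch also shows the Pattern.CASE_INSENSITIVE flag. The class name TradeCounter and the method countMatches are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for this illustration.
class TradeCounter {

    // The example text from the start of the tutorial.
    static final String TEXT = "Trade involves the transfer of goods. "
            + "Possible synonyms of \"trade\" include \"commerce\" and \"financial transaction\". "
            + "Types of trade include barter. A network that allows trade is called a market.";

    // Counts how many times the pattern is found in the text.
    static int countMatches(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Case-sensitive: only the capitalised "Trade" at the start matches.
        System.out.println(countMatches(Pattern.compile("Trade"), TEXT)); // prints 1

        // Case-insensitive: "Trade" and every "trade" match.
        System.out.println(countMatches(Pattern.compile("Trade", Pattern.CASE_INSENSITIVE), TEXT)); // prints 4
    }
}
```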

Pattern Matching In Java:

Java provides its pattern matching classes in the package java.util.regex. The main classes for pattern matching are Pattern and Matcher.
Package : java.util.regex
Main classes : Pattern, Matcher.

Let's see how to perform a pattern match in Java using these classes.

Steps:

1. Store your regular expression in a String, for example String regEx = "\\d+(,\\d+)?"; This regular expression finds numbers such as "123" and "123,456" in a text. Here "\\d" represents a single digit, "+" means one or more times, and "?" means zero or one time.
String regEx = "\\d+(,\\d+)?";

2. Get a Pattern object by passing the regular expression to the compile(regEx) method of the Pattern class. This defines the search pattern.
Pattern pattern = Pattern.compile(regEx);

3. Get a Matcher object by calling the matcher(text) method of the Pattern object, passing the text in which you want to search for the pattern.
String text = "50% of 2,053 kg of milk";

Matcher matcher = pattern.matcher(text);

4. Now that you have created the Pattern and Matcher objects, the only thing that remains is searching for the pattern in the text. The Matcher class has two methods for this purpose: find() and matches().

5. The difference is that find() searches for the pattern anywhere inside the text, much like the contains() method of String, whereas matches() tries to match the pattern against the entire text, much like the equals() method of String.

6. For example, if you want to check whether the pattern occurs somewhere in the text, use the find() method. If you want to check whether the complete text matches the pattern, use the matches() method.

7. Both find() and matches() return a boolean value.

8. If you want to extract the value that matched the pattern, call the group() method of the Matcher class after find() or matches() has returned true. Calling group() without a successful match throws an IllegalStateException.
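The difference between find() and matches() described in the steps above can be sketched as follows, using the regular expression and the sample text from steps 1 and 3. The class name FindVersusMatches and its helper methods are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for this illustration.
class FindVersusMatches {

    // The pattern from step 1: digits, optionally followed by a comma and more digits.
    static final Pattern NUMBER = Pattern.compile("\\d+(,\\d+)?");

    // find(): collects every part of the text that matches the pattern.
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = NUMBER.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    // matches(): true only when the whole text matches the pattern.
    static boolean isNumber(String text) {
        return NUMBER.matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "50% of 2,053 kg of milk";
        System.out.println(findAll(text));     // prints [50, 2,053]
        System.out.println(isNumber(text));    // prints false
        System.out.println(isNumber("2,053")); // prints true
    }
}
```

The whole sentence is not a number, so matches() returns false for it, while find() still locates both numbers inside it.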

Now we will go through a simple program, which extracts numbers from text.
package com.speakingcs.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractNumbers {

    public static void main(String[] args) {
        String text = "12345 abcd 5678 defg, 1,234 hijk";
        ExtractNumbers en = new ExtractNumbers();
        en.getNumbers(text);
    }

    private void getNumbers(String text) {
        // One or more digits, optionally followed by a comma and more digits
        String regEx = "\\d+(,\\d+)?";
        Pattern pattern = Pattern.compile(regEx);
        Matcher matcher = pattern.matcher(text);

        // find() looks for the next occurrence of the pattern in the text
        while (matcher.find()) {
            System.out.println(matcher.group());
        }

        // matches() checks whether the entire text matches the pattern
        if (matcher.matches()) {
            System.out.println("pattern is matched against entire text");
        } else {
            System.out.println("pattern is not matched with the text");
        }
    }
}

The Matcher class has more methods for other purposes, such as replaceAll() (replaces every matched portion), replaceFirst() (replaces only the first occurrence), start() (the starting index of the matched portion), end() (the index just past the end of the matched portion) and so on.
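Here is a small sketch of these methods in action, reusing the number pattern from the program above. The class name MatcherExtras, the method names, and the "#" replacement string are arbitrary choices for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for this illustration.
class MatcherExtras {

    static final Pattern NUMBER = Pattern.compile("\\d+(,\\d+)?");

    // replaceAll(): replaces every number in the text with "#".
    static String maskAll(String text) {
        return NUMBER.matcher(text).replaceAll("#");
    }

    // replaceFirst(): replaces only the first number in the text with "#".
    static String maskFirst(String text) {
        return NUMBER.matcher(text).replaceFirst("#");
    }

    // start() and end(): describes where each match begins (inclusive) and ends (exclusive).
    static List<String> positions(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = NUMBER.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group() + " [" + matcher.start() + ", " + matcher.end() + ")");
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "50% of 2,053 kg of milk";
        System.out.println(maskAll(text));   // prints #% of # kg of milk
        System.out.println(maskFirst(text)); // prints #% of 2,053 kg of milk
        System.out.println(positions(text)); // prints [50 [0, 2), 2,053 [7, 12)]
    }
}
```

Keep in mind that "$" and "\" have a special meaning in the replacement string of replaceAll() and replaceFirst(), so a literal replacement containing them needs Matcher.quoteReplacement().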
