Sunday, November 18, 2007

Getting started with Pattern and Matcher from java.util.regex

Since Java 1.4, pattern matching has been available in the java.util.regex package. The package is as powerful as regular expressions themselves, but its interface is not the most intuitive (in my opinion). In addition, the javadocs are very formal, with few "how to use it" examples in the class docs. So here is a small "getting started" guide to the java.util.regex package.

Example: Matching and Looping over the Results
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    ...

    String regex = "Foo";
    String input = "FooBarFooBar";

    // Compile the regular expression once, then create a matcher for the input
    Pattern compiledPattern = Pattern.compile(regex);
    Matcher matcher = compiledPattern.matcher(input);

    // Each successful find() moves on to the next match in the input
    while (matcher.find()) {
        String matchedSubString = input.substring(matcher.start(), matcher.end());
        System.out.println(String.format("Matched '%s'", matchedSubString));
    }

First, we compile the regular expression with the line "Pattern compiledPattern = Pattern.compile(regex)". Then we create a Matcher object for the input string with the line "Matcher matcher = compiledPattern.matcher(input)". Creating the matcher does not search anything yet; the matcher is then used to step through the input, match by match. This is what we do in the loop:

    while (matcher.find()) {
        String matchedSubString = input.substring(matcher.start(), matcher.end());
        System.out.println(String.format("Matched '%s'", matchedSubString));
    }

The find() method on the matcher instance looks for the next matching substring in the input and returns true if one is found. After a find() call that returned true, you can use matcher.start() and matcher.end() as substring indices into the input string to get the text that was matched: start() is the index of the first matched character, and end() is the index just past the last one, which is exactly what String.substring expects.
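
To make the indices concrete, here is a minimal sketch that records where each match was found. The class name FindLoopSketch and the helper describeMatches are made up for this illustration; it also uses matcher.group(), which returns the same text as the substring call in the loop above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindLoopSketch {

    // Hypothetical helper: returns one "text@start-end" entry per match.
    static List<String> describeMatches(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            // group() is equivalent to input.substring(matcher.start(), matcher.end())
            result.add(matcher.group() + "@" + matcher.start() + "-" + matcher.end());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints "Foo@0-3" and then "Foo@6-9"
        for (String line : describeMatches("Foo", "FooBarFooBar")) {
            System.out.println(line);
        }
    }
}
```

Note that end() is exclusive: the first "Foo" covers indices 0, 1 and 2, and end() reports 3.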

There is more in the regex package, including capturing groups and replacement functionality, but this should be enough to get you started.
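
As a small taste of those two features, here is a hedged sketch rather than a full treatment. The class GroupAndReplaceSketch and its two helpers are invented for this example; Matcher.group(int) and Matcher.replaceAll(String) are the standard methods doing the work.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupAndReplaceSketch {

    // Hypothetical helper: text captured by the first parenthesised group
    // of the first match, or null if the regex does not match at all.
    static String firstGroup(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    // Hypothetical helper: replaces every match; "$1" in the replacement
    // refers to whatever the first group captured in that match.
    static String replaceEvery(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(firstGroup("Foo(Bar)", "FooBarFooBar"));         // Bar
        System.out.println(replaceEvery("Foo(Bar)", "FooBarFooBar", "$1")); // BarBar
        System.out.println(replaceEvery("Foo", "FooBarFooBar", "Baz"));     // BazBarBazBar
    }
}
```

Group 0 always stands for the whole match, so group(0) and group() return the same text.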
