The Java Developers Almanac 1.4



e436. Capturing Text in a Group in a Regular Expression

A group is a pair of parentheses used to group subpatterns. For example, h(a|i)t matches hat or hit. A group also captures the matching text within the parentheses. For example,
    input:   abbc
    pattern: a(b*)c
causes the substring bb to be captured by the group (b*). A pattern can have more than one group and the groups can be nested. For example,
    pattern: (a(b*))+(c*)
contains three groups:
    group 1: (a(b*))
    group 2: (b*)
    group 3: (c*)
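The simple capture described above can be checked with a short program. This is only a minimal sketch: the class name SimpleGroupDemo and the helpers captured and groupCount are hypothetical names chosen for illustration, and matches() is used so that the pattern must cover the whole input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimpleGroupDemo {
    // Hypothetical helper: returns the text captured by the given group,
    // or null if the pattern does not match the entire input.
    static String captured(String patternStr, String input, int group) {
        Matcher m = Pattern.compile(patternStr).matcher(input);
        return m.matches() ? m.group(group) : null;
    }

    // Hypothetical helper: number of capturing groups written in the pattern.
    static int groupCount(String patternStr) {
        return Pattern.compile(patternStr).matcher("").groupCount();
    }

    public static void main(String[] args) {
        // The group (b*) captures the run of b's between a and c
        System.out.println(captured("a(b*)c", "abbc", 1));   // prints bb
        // The nested pattern contains three capturing groups
        System.out.println(groupCount("(a(b*))+(c*)"));      // prints 3
    }
}
```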
Groups are numbered from left to right by the position of their opening parenthesis, so an outer group is numbered before the groups nested inside it. There is also an implicit group 0, which contains the entire match. Here is an example of what is captured in each group:
    input:   abbabcd
    pattern: (a(b*))+(c*)
    group 0: abbabc
    group 1: ab
    group 2: b
    group 3: c
Notice that group 1 was applied twice, first to abb and then to ab. Only the most recent match is captured. Note also that when a quantifier such as * is applied to a group and the group matches zero times in a later repetition, the group is not cleared; it still holds the most recently captured text. For example,
    input:   aba
    pattern: (a(b)*)+
    group 0: aba
    group 1: a
    group 2: b
Group 1 first matched ab, capturing b in group 2. Group 1 then matched the final a, with group 2 matching zero b's; this left the previously captured b intact.
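The aba example above can be reproduced with a few lines of code. This is a sketch under stated assumptions: the class name RepeatedGroupDemo and the helper groups are hypothetical, and matches() is used here (rather than find()) because the pattern covers the whole input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroupDemo {
    // Hypothetical helper: returns groups 0..groupCount() captured when the
    // pattern matches the entire input, or null if it does not match.
    static String[] groups(String patternStr, String input) {
        Matcher m = Pattern.compile(patternStr).matcher(input);
        if (!m.matches()) {
            return null;
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i < result.length; i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Group 1 is applied twice (ab, then a); group 2 matches zero
        // times on the second pass and keeps its earlier capture.
        String[] g = groups("(a(b)*)+", "aba");
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + i + ": " + g[i]);
        }
    }
}
```

The printed values should agree with the group 0, group 1, and group 2 lines listed in the example.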

Note: If it is not necessary for a group to capture text, you should use a non-capturing group since it is more efficient. For more information, see e438 Using a Non-Capturing Group in a Regular Expression.
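As a brief illustration of the note above, the following sketch compares a capturing group with its non-capturing form, which Java writes as (?:X). The class name NonCapturingDemo and its helper are hypothetical; the h(a|i)t pattern is the one used at the start of this example. Both patterns match the same strings, but the non-capturing version does not create a numbered group.

```java
import java.util.regex.Pattern;

public class NonCapturingDemo {
    // Hypothetical helper: number of capturing groups in the pattern.
    static int groupCount(String patternStr) {
        return Pattern.compile(patternStr).matcher("").groupCount();
    }

    public static void main(String[] args) {
        String capturing = "h(a|i)t";
        String nonCapturing = "h(?:a|i)t";

        // Both forms accept the same input
        System.out.println(Pattern.matches(capturing, "hit"));     // prints true
        System.out.println(Pattern.matches(nonCapturing, "hit"));  // prints true

        // Only the capturing form records text in a group
        System.out.println(groupCount(capturing));     // prints 1
        System.out.println(groupCount(nonCapturing));  // prints 0
    }
}
```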

This example demonstrates how to retrieve the text in a group.

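    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    CharSequence inputStr = "abbabcd";
    String patternStr = "(a(b*))+(c*)";
    
    // Compile and use regular expression
    Pattern pattern = Pattern.compile(patternStr);
    Matcher matcher = pattern.matcher(inputStr);
    boolean matchFound = matcher.find();
    
    if (matchFound) {
        // Get all groups for this match; group 0 is the entire match
        for (int i = 0; i <= matcher.groupCount(); i++) {
            // group(i) returns null if group i did not take part in the match
            String groupStr = matcher.group(i);
            System.out.println("group " + i + ": " + groupStr);
        }
    }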

 Related Examples
e437. Getting the Indices of a Matching Group in a Regular Expression
e438. Using a Non-Capturing Group in a Regular Expression
e439. Using the Captured Text of a Group within a Pattern
e440. Using the Captured Text of a Group within a Replacement Pattern



© 2002 Addison-Wesley.