60

I try to get string between <%= and %>, here is my implementation:

String str = "ZZZZL <%= dsn %> AFFF <%= AFG %>";
Pattern pattern = Pattern.compile("<%=(.*?)%>");
String[] result = pattern.split(str);
System.out.println(Arrays.toString(result));

it return

[ZZZZL ,  AFFF ]

But my expectation is:

[ dsn , AFG ]

Where am i wrong and how to correct it ?

1
  • 5
    It's appears you're confusing splitting a string with pattern matching. Commented May 16, 2013 at 20:57

4 Answers 4

96

Your pattern is fine. But you shouldn't be split()ting it away, you should find() it. Following code gives the output you are looking for:

String str = "ZZZZL <%= dsn %> AFFF <%= AFG %>";
Pattern pattern = Pattern.compile("<%=(.*?)%>", Pattern.DOTALL);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Sign up to request clarification or add additional context in comments.

1 Comment

I added Pattern.DOTALL for those quite common cases when the expected match spans across multiple lines.
60

I have answered this question here: https://stackoverflow.com/a/38238785/1773972

Basically use

StringUtils.substringBetween(str, "<%=", "%>");

This requirs using "Apache commons lang" library: https://mvnrepository.com/artifact/org.apache.commons/commons-lang3/3.4

This library has a lot of useful methods for working with string, you will really benefit from exploring this library in other areas of your java code !!!

3 Comments

I seriously dont understand why do we need to add a new dependency jar/library all together in the project just to acheive a small trivial funcionality
Adding dependencies has become a joke :)
@Stunner, you are right, it's overkill. But this answer is mostly for those who already have commons-lang3 in their classpath.
8

Jlordo approach covers specific situation. If you try to build an abstract method out of it, you can face a difficulty to check if 'textFrom' is before 'textTo'. Otherwise method can return a match for some other occurance of 'textFrom' in text.

Here is a ready-to-go abstract method that covers this disadvantage:

  /**
   * Get text between two strings. Passed limiting strings are not 
   * included into result.
   *
   * @param text     Text to search in.
   * @param textFrom Text to start cutting from (exclusive).
   * @param textTo   Text to stop cuutting at (exclusive).
   */
  public static String getBetweenStrings(
    String text,
    String textFrom,
    String textTo) {

    String result = "";

    // Cut the beginning of the text to not occasionally meet a      
    // 'textTo' value in it:
    result =
      text.substring(
        text.indexOf(textFrom) + textFrom.length(),
        text.length());

    // Cut the excessive ending of the text:
    result =
      result.substring(
        0,
        result.indexOf(textTo));

    return result;
  }

2 Comments

This will not return multiple occurrences when there are multiple matches. Only the first one.
Though it returns only first match, fulfills my need.
5

Your regex looks correct, but you're splitting with it instead of matching with it. You want something like this:

// Untested code
Matcher matcher = Pattern.compile("<%=(.*?)%>").matcher(str);
while (matcher.find()) {
    System.out.println(matcher.group());
}

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.