Style
- Use 'final' keyword where you can. It enables the JIT compiler to make better choices when optimizing, and it also ensures that bugs don't creep in to your code. For example,
private static int getNonStarIndex(String regex, int starIndex) {...} should be private static final int getNonStarIndex(final String regex, final int starIndex) {...} and public static boolean match(String regex, String str) {...} should be public static final boolean match(final String regex, final String str) {...}
- Use better names for non-for-loop variables.
i is OK for for (int i = 0; i < ...; i++) because this is a clear 'standard', but using i and j are poor choices for variables that are manipulated inside the loops. rexpos and strpos are examples of names I would use.
- I always put '1-line-blocks' on their own line, but I accept that
if (regex == null) throw new NullPointerException("The regex cannot be null"); type lines are OK. But, the following 1-liner is never OK: if (regex.charAt(nextNonStarIndex) == DOT) {i = nextNonStarIndex + 1; j++; continue;}
Functional
More detailed assessment.... I added the following tests to your code:
System.out.println("Should be false from here");
System.out.println(Regex.match("abc", "abcd"));
System.out.println(Regex.match("*a", "abcd"));
System.out.println(Regex.match("a", ""));
System.out.println(Regex.match(".a*c", "abc"));
System.out.println(Regex.match("a.*b", "abc"));
System.out.println(Regex.match("..", "abc"));
System.out.println(Regex.match("", ""));
System.out.println(Regex.match("", "abc"));
and discovered that your code fails for Regex.match("", ""). This is just bad form. The 'assignment' gave a whole bunch of positive and negative test cases, but you only chose to test the positive ones. That counts as 'half-a-job' when testing is concerned. It took me a few minutes to copy/paste the tests in and found what should be a 'fail-the-assignment' condition.
Then I looked at your algorithm and figured it was 'passing' early tests too early... let me try to explain.... consider the following:
Regex.match("a*.b.*c", "axxbxxbxxc");
The above is not one of the official test cases, but, it should return true, but your code returns false.
This is because you look for only the most 'obvious' match...
Recommendations
The algorithm you have chosen is too simplistic, having only 1 loop inside the 'STAR' handling loop.
I re-worked your code to use recursion instead, and it handles things much better. Without giving the whole algorithm away, consider a recursive function (copy/paste of your code, renamed match to be matchRecursive):
public static boolean matchRecursive(final String regex, int regexpos,
final String str, int strpos) {
while ((regexpos < regex.length() && strpos < str.length())) {
if (regex.charAt(regexpos) == str.charAt(strpos) || regex.charAt(regexpos) == DOT) {
regexpos++;
strpos++;
} else if (regex.charAt(regexpos) == STAR) {
regexpos++;
// see if there's anything remaining in the regex.
// if not, return true (star matches rest of str).
// otherwise, recursively check if the rest of the regex
// matches anywhere in the rest of the str....
while (strpos < str.length()) {
if (matchRecursive(regex, regexpos, str, strpos) {
return true;
}
strpos++;
}
} else {
return false;
}
}
// tidy up odd use-cases like:
// regexes ending in * when there's nothing left in str.
// empty regex
// etc.
return ....;
}