I'm writing a Scala parser combinator library for a personal project and wanted to borrow some ideas from the scala-parser-combinators project. One of its traits is RegexParsers, which fixes the input element type to Char and defines an implicit conversion from String to Parser[String], so that string literals can be used directly as parsers.
/** A parser that matches a literal string */
implicit def literal(s: String): Parser[String] = new Parser[String] {
  def apply(in: Input) = {
    val source = in.source
    val offset = in.offset
    val start = handleWhiteSpace(source, offset)
    var i = 0
    var j = start
    while (i < s.length && j < source.length && s.charAt(i) == source.charAt(j)) {
      i += 1
      j += 1
    }
    if (i == s.length)
      Success(source.subSequence(start, j).toString, in.drop(j - offset), None)
    else {
      val found = if (start == source.length()) "end of source" else "'"+source.charAt(start)+"'"
      Failure("'"+s+"' expected but "+found+" found", in.drop(start - offset))
    }
  }
}
Note the error reporting at the end. I don't understand why the position of the mismatch is reported as start. If the string is "apple" and the input is "...apply...", the mismatch is at the character 'y', which is at index j, not at the character 'a', which is at index start.
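To make the two indices concrete, here is a minimal standalone sketch of the same comparison loop. It does not depend on scala-parser-combinators, and the names LiteralMismatch, scan, and describeFailure are mine, not the library's. The message built from j is only what I would have expected, not necessarily what the library intends.

```scala
// Standalone sketch; the object and method names are hypothetical.
object LiteralMismatch {

  /** Runs the same loop as `literal` and returns (i, j) at the point where it stops:
    * i = number of characters of s that matched, j = index in source that was reached.
    */
  def scan(s: String, source: CharSequence, start: Int): (Int, Int) = {
    var i = 0
    var j = start
    while (i < s.length && j < source.length && s.charAt(i) == source.charAt(j)) {
      i += 1
      j += 1
    }
    (i, j)
  }

  /** Builds a failure message from the character at j (the first mismatch)
    * instead of the character at start. Returns None if s matched completely.
    * Assumption: this is the behavior I expected, not the library's actual one.
    */
  def describeFailure(s: String, source: CharSequence, start: Int): Option[String] = {
    val (i, j) = scan(s, source, start)
    if (i == s.length) None
    else {
      val found = if (j == source.length) "end of source" else "'" + source.charAt(j) + "'"
      Some("'" + s + "' expected but " + found + " found")
    }
  }
}
```

With s = "apple" and source = "apply", scan stops with i = 4 and j = 4, so describeFailure names 'y', whereas the library code reads the character at start and names 'a'.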
Is this a bug?
Edit:
I just ran a test, and it confirms the bug: the failure reports 'a' rather than 'y'. I opened https://github.com/scala/scala-parser-combinators/issues/573.
println(parse("apple", "apply"))
Failure('apple' expected but 'a' found,CharSequenceReader('a', ...))