I'm trying to understand why grep -w (version 3.1 of the GNU implementation) matches only the first occurrence of a certain pattern in a line.
Here's an example. I would expect it to match n1, n2 and n3, but it matches only the first one.
$ echo 'n1=1 n2=2 n3=3' | grep -ow "n[0-9]=*"
n1
Or if I tell it to match only n2 or n3, again it matches the first one, and ignores n3.
$ echo 'n1=1 n2=2 n3=3' | grep -ow "n[23]=*"
n2
What am I missing here? Is there any explanation for this behavior, or is it some sort of bug in grep?
The idea is to match either:
- n[0-9] preceded and followed by a non-word character.
- A substring that begins with n[0-9], followed by any number of = characters, and ends with a non-word character.
So for instance, if the string is n1=1 n2=== n3=3 n4== n5, the expected result should be:
n1
n2===
n3
n4==
n5
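To make the intent above concrete, here is a minimal sketch in Python rather than grep. It assumes Python's re module as a stand-in matcher; the names INTENDED and intended_matches are hypothetical, and the lookarounds are only one way of spelling "not adjacent to a word character". It is not a claim about how grep implements -w.

```python
import re

# Assumption: Python's `re` is used as a stand-in matcher. This models the
# intent described in the question, not grep's actual -w implementation.
#   (?<!\w)  the match is not preceded by a word character
#   (?!\w)   the match is not followed by a word character
# A word character is a letter, digit, or underscore.
INTENDED = re.compile(r"(?<!\w)n[0-9]=*(?!\w)")


def intended_matches(line):
    """Return every substring the question expects `grep -ow` to print."""
    return INTENDED.findall(line)


if __name__ == "__main__":
    for match in intended_matches("n1=1 n2=== n3=3 n4== n5"):
        print(match)
    # Prints:
    # n1
    # n2===
    # n3
    # n4==
    # n5
```

For n1=1 and n3=3, the greedy =* first consumes the = sign, the lookahead then rejects the following digit, and the engine backtracks to the shorter match n1 or n3. That backtracking step is what I expected grep -w to do as well.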
Clarification: I know that the goal can be achieved by grep -ow -e 'n[0-9]' -e "n[0-9]=*", but that's beside the point. The goal of the question is to understand how grep works.
Additional tests
If I add n<num>= to different places in the line (without a following word character after the equal sign), it will match those as well, but again it will ignore n3=3.
$ echo 'n1=1 n2= n3=3 n4=' | grep -ow "n[0-9]=*"
n1
n2=
n4=
The last thing I've found is that with -P (interpret the pattern as a Perl-compatible regular expression), grep doesn't seem to honor the -w description, which says that the substring "must be either at the end of the line or followed by a non-word constituent character": it matches n1= even though it's followed by the character 1, which is a word constituent character ("letters, digits, and the underscore").
$ echo 'n1=1 n2= n3=3 n4=' | grep -owP "n[0-9]=*"
n1=
n2
n3=
n4
So it seems that grep -wP searches for a word boundary at the end of the substring rather than a non-word constituent character. It seems equivalent to:
$ echo 'n1=1 n2= n3=3 n4=' | grep -o "\bn[0-9]=*\b"
n1=
n2
n3=
n4
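The difference between "word boundary" and "non-word constituent character" can be sketched outside grep as well. The following assumes Python's re module as a stand-in, with \b behaving like the \b in the command above for this ASCII input; the function names are hypothetical.

```python
import re

# Assumption: Python's `re` stands in for grep's matchers; for this ASCII
# input its \b is a boundary between a word and a non-word character.
BOUNDARY = re.compile(r"\bn[0-9]=*\b")          # word-boundary reading
NONWORD = re.compile(r"(?<!\w)n[0-9]=*(?!\w)")  # "non-word constituent" reading


def boundary_matches(line):
    """Matches that must start and end on a word boundary."""
    return BOUNDARY.findall(line)


def nonword_matches(line):
    """Matches that must not touch a word character on either side."""
    return NONWORD.findall(line)


if __name__ == "__main__":
    line = "n1=1 n2= n3=3 n4="
    print(boundary_matches(line))  # ['n1=', 'n2', 'n3=', 'n4']
    print(nonword_matches(line))   # ['n1', 'n2=', 'n3', 'n4=']
```

The first list reproduces the -P output shown above: a trailing = is kept only when a word character follows it, because only then is there a boundary after it. The second list is what the -w description in the documentation would suggest.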