Looks like you want:
grep -Ex '[[:space:]]*my_name[[:space:]]+[0123456789]+[[:space:]]*' file.txt
- inside double quotes, $(cmd), which expands to the output of cmd and is called command substitution, is still performed. Here, you're trying to get the output of the ?!. command¹, which doesn't exist, hence the error. As many regexp operators are characters that also have a special meaning in the shell, it is best to quote them with single quotes, as those escape every character for the shell.
- \s and (?!.) are perl regexp operators. While some grep implementations do support \s these days (as a short form of the standard [[:space:]]), they cannot support the (?!.) negative look-ahead operator without -E, because in basic regexps (BRE) (?! is required to match a literal (?!, as it always has since grep was introduced in the early 70s. They could recognise it with -E (introduced by POSIX in the early 90s) as (?... is unspecified with standard Extended regexps (ERE). The ast-open implementation of grep is one that does, but that's the only one I know. In any case, having anything after $, which is meant to match at the end of the subject, does not make sense. Adding a quantifier (* in your case) after a look-around operator also does not make sense.
- + is an ERE or Perl RE operator, not a BRE one. The BRE equivalent is \{1,\}.
- [0-9] does match the 0123456789 ASCII arabic digits, but often also many other (often number-related) characters which happen to sort between 0 and 9. If, as when doing input validation, you only want to match on 0123456789, you need to specify the exact list.
- Adding -x to match the line as a whole saves having to use the ^ and $ anchors.
- -i is for case insensitive matching. With it, MY_NAME or My_NaMe would also be accepted.
- I used -E above just because + is shorter to type than \{1,\}, which is just cosmetic. Using EREs here is not strictly necessary.
- Some implementations of grep do support a -P option to match using Perl-like regular expressions. With those, you could do grep -Px '\s*my_name\s+\d+\s*'. \d is meant to match a decimal digit. In the version of GNU grep I tried it on, it only matched on 0123456789, but I can't guarantee that will be the case with all versions and implementations. See for instance how grep -Px '(*UCP)\d' (for Unicode Character Properties) matches many more decimal digit characters. To be on the safe side, you may still want to use [0123456789] there.
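To see the quoting and BRE/ERE points in action, here is a small sketch. It assumes a grep that supports the standard -E, -x, -i and -c options; the sample lines and the variable names are made up for illustration.

```shell
# Hypothetical sample data, written to a temporary file for illustration.
sample=$(mktemp)
printf '%s\n' 'my_name 123' '  my_name   42  ' 'my_name abc' 'xmy_name 1' 'MY_NAME 7' > "$sample"

# Single quotes: the shell passes $(...) through untouched.
literal='a$(echo X)b'
# Double quotes: command substitution is still performed.
expanded="a$(echo X)b"

# ERE form, using + for "one or more".
ere_matches=$(grep -Ex '[[:space:]]*my_name[[:space:]]+[0123456789]+[[:space:]]*' "$sample")
# BRE form, where \{1,\} replaces +.
bre_matches=$(grep -x '[[:space:]]*my_name[[:space:]]\{1,\}[0123456789]\{1,\}[[:space:]]*' "$sample")
# With -i, the MY_NAME line is counted as well (-c prints the number of matching lines).
icase_count=$(grep -icx '[[:space:]]*my_name[[:space:]]\{1,\}[0123456789]\{1,\}[[:space:]]*' "$sample")

rm -f "$sample"
```

Both the ERE and BRE forms should select the same two lines here, and the -i variant should count three.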
Another approach could be to use awk and do:
awk 'NF == 2 && $1 == "my_name" && $2 ~ /^[0123456789]+$/'
That more clearly specifies that you want lines that have two fields (where the Number of Fields is 2), the first of which is my_name, and the second made only of ASCII decimal digits. Fields by default are delimited with blanks (more like [[:blank:]]), though some awk implementations only consider space and tab, and some also consider vertical spacing characters like [[:space:]] / \s do.
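As a rough illustration of that awk filter, here is a sketch run against a few made-up input lines. The filter_lines wrapper is a hypothetical name used only to keep the example self-contained.

```shell
# Hypothetical wrapper around the awk filter, reading lines on stdin.
filter_lines() {
  awk 'NF == 2 && $1 == "my_name" && $2 ~ /^[0123456789]+$/'
}

# Only the first two lines have exactly two fields, my_name first
# and ASCII digits second; awk prints them unchanged.
printf '%s\n' 'my_name 123' '  my_name   42  ' 'my_name 12 extra' 'my_name 1a' 'other 5' |
  filter_lines
```

Leading and trailing blanks are ignored when awk splits fields by default, which is why the second sample line is still accepted.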
For a case insensitive match, you'd do tolower($1) == "my_name". The GNU implementation of awk can do case insensitive matching for all regexp matching by passing -v IGNORECASE=1.
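A minimal sketch of the portable tolower() variant, again with made-up input and a hypothetical wrapper name (filter_lines_icase):

```shell
# Hypothetical wrapper: same filter, but the first field is compared
# case-insensitively by lowercasing it first.
filter_lines_icase() {
  awk 'NF == 2 && tolower($1) == "my_name" && $2 ~ /^[0123456789]+$/'
}

# The first three lines are printed; MY_NAMES is a different word.
printf '%s\n' 'MY_NAME 1' 'My_NaMe 22' 'my_name 333' 'MY_NAMES 4' |
  filter_lines_icase
```

Since tolower() is a standard awk function, this should not depend on the GNU-specific IGNORECASE variable.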
¹ Technically, ?!. is a shell glob as ? is a glob operator. So if there were files called a!. and b!. in the current working directory, that would be expanded to those and you'd be trying to execute the a!. command with b!. as argument. With saner shells such as zsh or fish (see also the failglob option in bash), you'd get an error when that ?!. doesn't match any file.
$(?!.)* ? The reason for your error is that you used double quotes, which allow the shell to interpret $(?!.) as a command substitution, but I don't understand its purpose in your regex; it doesn't look like something a basic grep will understand.