
I am trying to grab my_name 369 from file.txt. I ran the grep command below with a regex, but it returns an error. I have also tried egrep, and it did not work either.

Input:

  1. my_name 369 == match
  2. my_name 161 == match
  3. my_name 123 2 != error

Error: ?!.: command not found

This is the command I ran; the regex worked on regex101:

grep -i "^\s*my_name\s+[0-9]+\s*$(?!.)*" file.txt
  • 2
    What is this part trying to do: $(?!.)*? Your error occurs because you used double quotes, which allow the shell to interpret $(?!.) as a command substitution. But I don't understand its purpose in your regex; it doesn't look like something a basic grep will understand.
    – bxm
    Commented Aug 17, 2022 at 6:15
  • 1
    If you describe a bit more what you are trying to achieve, including sample input and desired output it will be easier to offer a solution.
    – bxm
    Commented Aug 17, 2022 at 6:17
  • == my_name 369 and != if my_name 369 2r@
    – JakePaul
    Commented Aug 17, 2022 at 6:20
  • 1
    Please edit your question to provide sample input and corresponding expected output. Don't reply in the comments. Commented Aug 17, 2022 at 6:34
  • Please add detail to your question rather than via comments. A sample input file and what you want from it would be best, rather than a couple of terms. Does your input have multiple lines? Do you want to extract exactly matching lines or the matching part of any line? Do you expect just one thing to be returned from your command?
    – bxm
    Commented Aug 17, 2022 at 6:39

1 Answer


Looks like you want:

grep -Ex '[[:space:]]*my_name[[:space:]]+[0123456789]+[[:space:]]*' file.txt
  • Inside double quotes, command substitution is still performed: $(cmd) expands to the output of cmd. Here, you're asking for the output of the ?!. command¹, which doesn't exist, hence the error. As many regexp operators are characters that also have a special meaning in the shell, it is best to put regexps in single quotes, which escape every character for the shell.
  • \s and (?!.) are perl regexp operators. While some grep implementations do support \s these days (as a short form of the standard [[:space:]]), they cannot support the (?!.) negative look-ahead operator in their default basic regexp (BRE) mode, because there (?! is required to match a literal (?!, as it always has since grep was introduced in the early 70s. They could recognise it with -E (introduced by POSIX in the early 90s), as (?... is unspecified in standard extended regexps (ERE). The ast-open implementation of grep is one that does, but that's the only one I know of. In any case, having anything after $, which is meant to match at the end of the subject, does not make sense. Adding a quantifier (* in your case) after a look-around operator does not make sense either.
  • + is an ERE or Perl RE operator, not a BRE one. The BRE equivalent is \{1,\}.
  • [0-9] does match the ASCII Arabic digits 0123456789, but often also many other (often number-related) characters that happen to sort between 0 and 9. If, as when doing input validation, you only want to match 0123456789, you need to spell out the exact list.
  • Adding -x to match the line as a whole saves having to use the ^ and $ anchors.
  • -i is for case insensitive matching. With it, MY_NAME, My_NaMe would also be accepted.
  • I used -E above just because + is shorter to type than \{1,\}; that is purely cosmetic, and using EREs here is not strictly necessary.
  • Some implementations of grep support a -P option to match using Perl-like regular expressions. With those, you could do grep -Px '\s*my_name\s+\d+\s*'. \d is meant to match a decimal digit. In the version of GNU grep I tried it on, it only matched 0123456789, but I can't guarantee that will be the case with all versions and implementations. See for instance how grep -Px '(*UCP)\d' (for Unicode Character Properties) matches many more decimal digit characters. To be on the safe side, you may still want to use [0123456789] there.
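
As a quick check of the points above, here is a minimal sketch that runs the ERE command and its BRE equivalent against the three sample lines. It assumes the file contains exactly my_name 369, my_name 161 and my_name 123 2 (my reading of the question), and the scratch file and the variable names sample, ere_out and bre_out are made up for illustration. mktemp is not in POSIX but is widely available.

```shell
# Illustrative only: a scratch file holding the three sample lines (assumed contents).
sample=$(mktemp)
printf '%s\n' 'my_name 369' 'my_name 161' 'my_name 123 2' > "$sample"

# ERE form (-E): single quotes keep the shell away from the regex.
ere_out=$(grep -Ex '[[:space:]]*my_name[[:space:]]+[0123456789]+[[:space:]]*' "$sample")

# BRE form (no -E): + is spelled \{1,\}.
bre_out=$(grep -x '[[:space:]]*my_name[[:space:]]\{1,\}[0123456789]\{1,\}[[:space:]]*' "$sample")

# Both variables should hold the first two lines only; the third has an extra field.
printf '%s\n' "$ere_out"

rm -f "$sample"
```

With that input, both forms should select my_name 369 and my_name 161 and reject my_name 123 2.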

Another approach could be to use awk and do:

awk 'NF == 2 && $1 == "my_name" && $2 ~ /^[0123456789]+$/'

This more clearly specifies that you want lines that have two fields (where NF, the Number of Fields, is 2), the first of which is my_name and the second of which is made only of ASCII decimal digits. Fields are delimited by blanks by default (more like [[:blank:]]), though some awk implementations only consider space and tab, and some also consider vertical spacing characters, as [[:space:]] / \s do.
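
To see the field-based test in action, here is a small sketch on made-up input: the question's three samples (assumed contents), one line with leading blanks and one with a non-digit in the second field. The scratch file and the names sample and awk_out are only for illustration.

```shell
# Illustrative scratch file (assumed contents, not the asker's real file).
sample=$(mktemp)
printf '%s\n' 'my_name 369' '  my_name 161' 'my_name 123 2' 'my_name 12a' > "$sample"

# Keep lines with exactly two fields: my_name, then ASCII decimal digits only.
awk_out=$(awk 'NF == 2 && $1 == "my_name" && $2 ~ /^[0123456789]+$/' "$sample")

# Matching lines are printed unchanged, so the leading blanks are preserved.
printf '%s\n' "$awk_out"

rm -f "$sample"
```

With default field splitting, leading blanks do not create an empty first field, so the indented line still counts as two fields, while my_name 123 2 (three fields) and my_name 12a (non-digit) are dropped.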

For a case insensitive match, you'd do tolower($1) == "my_name". The GNU implementation of awk can do case insensitive matching for all regexp matching by passing -v IGNORECASE=1.
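
A minimal sketch of the tolower() variant, fed with made-up mixed-case input (the sample lines and the name ci_out are hypothetical):

```shell
# Illustrative input: two mixed-case spellings of my_name and one unrelated line.
ci_out=$(printf '%s\n' 'MY_NAME 369' 'My_NaMe 161' 'other 5' |
  awk 'NF == 2 && tolower($1) == "my_name" && $2 ~ /^[0123456789]+$/')

# Only the first field is lower-cased for the comparison; the output keeps the original case.
printf '%s\n' "$ci_out"
```

tolower() is part of standard awk, so this should not depend on the GNU implementation.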


¹ Technically, ?!. is a shell glob as ? is a glob operator. So if there were files called a!. and b!. in the current working directory, that would be expanded to those and you'd be trying to execute the a!. command with b!. as argument. With saner shells such as zsh or fish (see also the failglob option in bash), you'd get an error when that ?!. doesn't match any file.
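
The glob behaviour can be observed safely with echo instead of a command substitution. This is a sketch that assumes it is run as a script by sh or bash with default options (pasted into an interactive bash, the ! may trigger history expansion instead); the scratch directory and the names dir, no_match and with_match are made up for illustration.

```shell
# Illustrative scratch directory, so the glob only sees files created here.
dir=$(mktemp -d)

# No file matches yet: sh and default bash leave the pattern untouched.
no_match=$(cd "$dir" && echo ?!.)

# Create two files whose names match ?!. (any one character, then "!.").
touch "$dir"/'a!.' "$dir"/'b!.'

# Now the same unquoted word expands to the matching file names.
with_match=$(cd "$dir" && echo ?!.)

printf '%s\n' "$no_match" "$with_match"
rm -rf "$dir"
```

Under those assumptions, the first echo gives back ?!. as typed and the second gives a!. b!., which is what the shell would then try to run as a command and its argument.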
