
I'm trying to build a command that greps for a common list of error keywords (e.g. bug occured!, error, exception), but I also need to exclude some common keywords (e.g. the DEBUG tag) without throwing away a line that still contains a real match. The command should be robust enough to handle a wide variety of sources and logs.

Let's say I have this source:

$ cat dummy.log 
12345   DEBUG   debug.log abc
!DEBUG
!bug
!debug
DEBUG noop
12345 DEBUG bug occured
please report BUG to me
the filename is critical_bug.log
bug should be fix.
noop
throws error
a stuff
b otherstuff
c otherstuff stuff

This command doesn't work because it excludes bug lines that also contain DEBUG (i.e. 12345 DEBUG bug occured):

$ cat -v dummy.log | nl | grep -Ei 'bug|stuff|error' | grep -Evi 'DEBUG|otherstuff'
 3  !bug
 7  please report BUG to me
 8  the filename is critical_bug.log
 9  bug should be fix.
11  throws error
12  a stuff

Changing the order of the pipes gives the same result:

$ cat -v dummy.log | nl | grep -Evi 'DEBUG|otherstuff' | grep -Ei 'bug|stuff|error'
 3  !bug
 7  please report BUG to me
 8  the filename is critical_bug.log
 9  bug should be fix.
11  throws error
12  a stuff
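
A minimal sketch of why the order makes no difference, assuming a one-line sample taken from dummy.log: grep -v works on whole lines, so a line is dropped as soon as any part of it matches the exclude pattern. The variable names are only for illustration.

```shell
# One sample line that contains both a wanted keyword (bug) and an unwanted one (DEBUG).
line='12345 DEBUG bug occured'

# grep -v discards the entire line when any part of it matches,
# so both orders end up with nothing. "|| true" only hides grep's
# non-zero exit status for "no match".
first=$(printf '%s\n' "$line" | grep -Ei 'bug|stuff|error' | grep -Evi 'DEBUG|otherstuff' || true)
second=$(printf '%s\n' "$line" | grep -Evi 'DEBUG|otherstuff' | grep -Ei 'bug|stuff|error' || true)

printf 'include then exclude: [%s]\n' "$first"
printf 'exclude then include: [%s]\n' "$second"
```

Both results are empty, which is why the exclusion has to be attached to the individual match rather than to the line.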

I tried using ^ in grep ([UPDATE] this was wrong, ^ is not for exclusion), but it included DEBUG noop, which doesn't contain a standalone bug (note: all of the filters should be case-insensitive, e.g. I want to accept BUG occured! and exclude debug.log):

$ cat -v dummy.log | nl | grep -Ei 'bug|stuff|error|^DEBUG|^otherstuff'
 1  12345   DEBUG   debug.log abc
 2  !DEBUG
 3  !bug
 4  !debug
 5  DEBUG noop
 6  12345 DEBUG bug occured
 7  please report BUG to me
 8  the filename is critical_bug.log
 9  bug should be fix.
11  throws error
12  a stuff
13  b otherstuff
14  c otherstuff stuff
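
A small sketch of what ^ does in this position, using two sample lines from dummy.log: outside a bracket expression it anchors the match to the start of the line and negates nothing. The variable names are only for illustration.

```shell
# ^DEBUG selects lines that BEGIN with DEBUG; it does not exclude anything.
anchored=$(printf 'DEBUG noop\n12345 DEBUG bug occured\n' | grep -Ei '^DEBUG')
printf '%s\n' "$anchored"

# After nl, every line starts with padding and a number, so ^DEBUG cannot match.
# grep -c prints 0 and exits non-zero here, hence the "|| true".
count=$(printf 'DEBUG noop\n' | nl | grep -ci '^DEBUG' || true)
printf 'matches after nl: %s\n' "$count"
```

In the pipeline above, the ^DEBUG and ^otherstuff alternatives therefore never match, and every line shown is selected by the plain bug, stuff, or error alternatives.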

I can't customize it to exclude only debug if I just use -w (e.g. the filename is critical_bug.log is not included):

$ grep -wnEi 'bug|stuff|error' dummy.log 
3:!bug
6:12345 DEBUG bug occured
7:please report BUG to me
9:bug should be fix.
11:throws error
12:a stuff
14:c otherstuff stuff
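
A short sketch of why -w skips critical_bug.log: grep treats letters, digits, and the underscore as word constituents, so the bug in _bug is not a whole word. The second file name is a made-up variant for comparison.

```shell
# Two file names that differ only in the character before "bug".
# "_" is a word constituent, "-" is not, so only the second line matches with -w.
matched=$(printf 'critical_bug.log\ncritical-bug.log\n' | grep -wi 'bug')
printf '%s\n' "$matched"
```

This prints only critical-bug.log, so -w cannot express "exclude debug but keep _bug".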

My expected output (note: I need to keep the match highlighting and the original line numbers):

$ grep -wnEi 'bug|stuff|error' dummy.log 
3:!bug
6:12345 DEBUG bug occured
7:please report BUG to me
8:the filename is critical_bug.log
9:bug should be fix.
11:throws error
12:a stuff
14:c otherstuff stuff

Is it possible to do this with grep or an alternative command?

  • Does your grep have a -w (--word-regexp) option? Commented Aug 6, 2020 at 12:17
  • @steeldriver I think so, -w is shown in --help. Commented Aug 6, 2020 at 12:19
  • @steeldriver Thanks, I noticed that grep -Ei 'bug|stuff|error|^DEBUG|^otherstuff' -w achieves what I wanted. Commented Aug 6, 2020 at 12:22
  • @steeldriver And I noticed I must keep my | nl |; I can't simply use grep -n to show the line numbers. Commented Aug 6, 2020 at 12:31
  • Please note that ^ means beginning of line (not exclude), and that grep works on files, so there is no need to pipe from cat. Commented Aug 6, 2020 at 12:35

1 Answer


Assuming GNU grep (the default on Linux), you could use PCRE mode and negative lookbehinds:

$ grep -niP '(?<!de)bug|(?<!other)stuff|error' dummy.log 
3:!bug
6:12345 DEBUG bug occured
7:please report BUG to me
8:the filename is critical_bug.log
9:bug should be fix.
11:throws error
12:a stuff
14:c otherstuff stuff

The options used are:

-n, --line-number
    Prefix each line of output with the 1-based line number within
    its input file.

-i, --ignore-case
    Ignore case distinctions in patterns and input data, so that
    characters that differ only in case match each other.

-P, --perl-regexp
    Interpret PATTERNS as Perl-compatible regular expressions (PCREs).
    This option is experimental when combined with the -z (--null-data)
    option, and grep -P may warn of unimplemented features.
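
The question also asks to keep the match highlighting. A minimal sketch, assuming GNU grep: the --color option (not used in the command above) can be set to always so that the highlighting survives even when the output is piped or captured. The two input lines are taken from dummy.log.

```shell
# --color=always keeps the highlighting even when stdout is not a terminal
# (for example when piping into less -R); -n keeps the original line number.
colored=$(printf 'DEBUG noop\nplease report BUG to me\n' |
  grep --color=always -niP '(?<!de)bug')
printf '%s\n' "$colored"
```

On a terminal, plain grep --color=auto (often already set through a shell alias) is usually enough.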

The magic happens in the lookbehind. The general format is (?<!foo)bar, which means "match bar, but only if it isn't preceded by foo". So (?<!de)bug matches bug unless it comes right after de, and (?<!other)stuff matches stuff unless it comes right after other.
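
A small sketch of how this plays out per match rather than per line, assuming GNU grep with -P support. The word ladybug is a made-up second exclusion, used only to show that lookbehinds can be chained.

```shell
# (?<!de)bug rejects the "bug" inside DEBUG but still finds the standalone one,
# so the first line is kept; the line that only contains DEBUG is not.
kept=$(printf '12345 DEBUG bug occured\nDEBUG noop\n' | grep -iP '(?<!de)bug')
printf '%s\n' "$kept"

# Hypothetical second exclusion: several lookbehinds can be chained, and each
# one must hold at the position where "bug" starts.
chained=$(printf 'debug\nladybug\ncritical_bug.log\n' | grep -iP '(?<!de)(?<!lady)bug')
printf '%s\n' "$chained"
```

The first command prints only the 12345 DEBUG bug occured line, and the second prints only critical_bug.log.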

  • Thanks, but this doesn't work if a line contains _bug.log. I want to customize the text I don't want. Commented Aug 6, 2020 at 12:53
  • @Fruit no, but why should it? Your question is looking for the word bug by itself. If that's not the case, please edit your question and clarify. What should happen with _bug.log? Should that line be kept? Should it be skipped? This is giving the exact same output as your example, so if this isn't actually what you want, you need to let us know. Commented Aug 6, 2020 at 12:54
  • Thanks, I added my expected output. Commented Aug 6, 2020 at 13:07
  • @alecxs with GNU grep and PCREs, you can use negative lookbehinds: printf 'debug\na bug\nbugles\njitterbugger\n' | grep -P '(?<!de)bug' Commented Aug 6, 2020 at 13:44
  • @Fruit oh, cool! I didn't think you could give multiple options of different length like that! Commented Aug 6, 2020 at 15:16
