I have a fairly complicated log and need to group its entries by minute and by error message. Here is a sample of the log:

2019-08-09T19:01:53:594+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test1","logTime":"2019-08-09T12:01:53.594Z","responseTime":4}
2019-08-09T19:01:53:673+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test2","logTime":"2019-08-09T12:01:53.673Z","responseTime":4}
2019-08-09T19:14:03:773+07:00 - info: {"tag":"request /validate","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"error internal"},"metadata":"test3","logTime":"2019-08-09T12:14:03.773Z","responseTime":7}
2019-08-09T19:19:32:925+07:00 - info: {"tag":"request /validate","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"error internal"},"metadata":"test4","logTime":"2019-08-09T12:19:32.925Z","responseTime":8}

My expected output looks like this:

19:01  errMessage : connect ECONNREFUSED 127.0.0.1:7000  10
19:02  errMessage : error internal  20
19:03  errMessage : error internal  10

Notes:

19:01 = hour and minute
errMessage : error internal = the value
20 = count of that error message

I already tried the awk pipeline below, but it does not yet produce the grouped result:

cat file.log | strings | grep "errMessage" | awk -F'[{,]' '{print $1,$3,$4,$5,$8}' | awk -F'[-,"]' '{print $3,$11,$12,$13,$15,$16,$17}' 

Could you please help me group the results by timestamp and count each value?

Thanks

  • How is the result arrived at, e.g. that 00:11 entry expectation comes from where ?
    – steve
    Commented Aug 9, 2019 at 18:09
  • And the 10, 20, 10 at the end of the lines?
    Commented Aug 9, 2019 at 19:25
  • 1. why use strings? is file.log not a text file? 2. you don't need either cat or grep if you're using awk. 3. piping the output of awk into another awk is almost always a sign that you are "doing it wrong" - in this case, you're outputting 5 fields from the first awk and then the second awk is printing fields 3,11,12,13,...17 - that's never going to work.
    – cas
    Commented Aug 10, 2019 at 0:52
  • Almost certainly. I started on a solution, but couldn't figure out what you meant by errMessage : error internal = value 20 = count message err and gave up. it seems like you're simultaneously saying each different error message has a specific value AND it's a counter for the number of times a given error message is seen. Either interpretation would require a completely different approach to the problem - I'm not telepathic and my time is limited.
    – cas
    Commented Aug 10, 2019 at 1:35
  • Please make sure that the expected output you post is the output you expect from the input you post, not the output you'd get given some different input. The more clear and simple your question is the better chance people will be willing/able to help you and so you'll get a good answer.
    – Ed Morton
    Commented Aug 10, 2019 at 17:11

1 Answer

The sample data in the question is too sparse to exercise the grouping, so I extended it. The extended log lets us demonstrate and verify both the grouping by minute and error message and the counting.

2019-08-09T19:02:00:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test1","logTime":"2019-08-09T12:02:00.000Z","responseTime":4}
2019-08-09T19:02:03:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test1","logTime":"2019-08-09T12:02:00.000Z","responseTime":4}
2019-08-09T19:02:10:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test2","logTime":"2019-08-09T12:02:10.000Z","responseTime":4}
2019-08-09T19:02:15:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Internal Server Error","data":{},"errMessage":"TypeError: Cannot read property 'name' of undefined"},"metadata":"test2","logTime":"2019-08-09T12:02:10.000Z","responseTime":10}
2019-08-09T19:02:20:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test3","logTime":"2019-08-09T12:02:20.000Z","responseTime":4}
2019-08-09T19:02:25:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test3","logTime":"2019-08-09T12:02:20.000Z","responseTime":4}
2019-08-09T19:02:30:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test4","logTime":"2019-08-09T12:02:30.000Z","responseTime":4}
2019-08-09T19:02:35:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Internal Server Error","data":{},"errMessage":"ReferenceError: foo is not defined"},"metadata":"test4","logTime":"2019-08-09T12:02:30.000Z","responseTime":20}
2019-08-09T19:02:40:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test5","logTime":"2019-08-09T12:02:40.000Z","responseTime":4}
2019-08-09T19:02:45:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test5","logTime":"2019-08-09T12:02:40.000Z","responseTime":4}
2019-08-09T19:02:50:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test6","logTime":"2019-08-09T12:02:50.000Z","responseTime":4}
2019-08-09T19:02:55:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test6","logTime":"2019-08-09T12:02:50.000Z","responseTime":4}
2019-08-09T19:03:00:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test7","logTime":"2019-08-09T12:03:00.000Z","responseTime":4}
2019-08-09T19:03:05:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test7","logTime":"2019-08-09T12:03:00.000Z","responseTime":4}
2019-08-09T19:03:10:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test8","logTime":"2019-08-09T12:03:10.000Z","responseTime":4}
2019-08-09T19:03:15:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Internal Server Error","data":{},"errMessage":"SyntaxError: Unexpected token ':'"},"metadata":"test8","logTime":"2019-08-09T12:03:10.000Z","responseTime":15}
2019-08-09T19:03:20:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test9","logTime":"2019-08-09T12:03:20.000Z","responseTime":4}
2019-08-09T19:03:25:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test9","logTime":"2019-08-09T12:03:20.000Z","responseTime":4}
2019-08-09T19:03:30:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test10","logTime":"2019-08-09T12:03:30.000Z","responseTime":4}
2019-08-09T19:03:35:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Internal Server Error","data":{},"errMessage":"TypeError: Cannot read property 'length' of undefined"},"metadata":"test10","logTime":"2019-08-09T12:03:30.000Z","responseTime":25}
2019-08-09T19:03:45:000+07:00 - info: {"tag":"request /browse","tagName":{"status":"05","message":"Unrecognise Request","data":{},"errMessage":"connect ECONNREFUSED 127.0.0.1:7000"},"metadata":"test11","logTime":"2019-08-09T12:03:40.000Z","responseTime":4}

With that, my TXR Lisp program produces this:

$ txr group.tl  < xlog
19:02  errMessage : connect ECONNREFUSED 127.0.0.1:7000  10
19:02  errMessage : ReferenceError: foo is not defined  1
19:02  errMessage : TypeError: Cannot read property 'name' of undefined  1
19:03  errMessage : connect ECONNREFUSED 127.0.0.1:7000  7
19:03  errMessage : SyntaxError: Unexpected token ':'  1
19:03  errMessage : TypeError: Cannot read property 'length' of undefined  1

The program in group.tl is:

(defstruct log ()
  hour minute
  err-message)

(build
  (let (cur-hh cur-mm)
    (flet ((flush ()
             (let ((by-message-hash (group-by .err-message (get))))
               (oust)
               (dohash (msg group by-message-hash)
                 (let ((lead (first group)))
                   (put-line `@{lead.hour}:@{lead.minute}  errMessage : @msg  @(len group)`))))))
      (whilet ((line (get-line)))
        (when-match `@{nil}T@hh:@mm:@nil - info: @json` line
          (let ((jobj (get-json json)))
            (if (or (nequal hh cur-hh)
                    (nequal mm cur-mm))
              (flush))
            (set cur-hh hh cur-mm mm)
            (add (new log
                      hour hh minute mm
                      err-message [[jobj "tagName"] "errMessage"])))))
      (flush))))

The program consists of two top-level forms:

  • a defstruct that defines the log structure, which holds the time and error message of one log entry.
  • a build expression which contains all the logic.

The build macro encloses an environment for the procedural construction and manipulation of a list. It provides a number of local operators. Here, we use three of them: add adds an item to the implicit list, get retrieves the list, and oust replaces the list with another one (an empty one by default, if no arguments are given).

The strategy is to scan the data and convert each line into a log object. Whenever the hour or minute changes, we call the local flush function to process the accumulated group. This relies on the log being in chronological order, so that all entries for a given minute are adjacent. flush uses (get) to retrieve the list of accumulated log objects, and (oust) to clear that list out. We also call flush one more time when we run out of input to get the last batch out.

The flush function groups the objects by error message to form a hash table and then dumps the info: for each error message it prints the time, the error message, and the size of the group sharing that error message.
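For readers who do not use TXR Lisp, the same batching idea can be sketched in Python. This is only an illustration of the strategy described above, not a translation of group.tl: the function name group_by_minute and the tuple layout (hour, minute, message) are assumptions made for the example, and it expects entries that have already been parsed and are in log order.

```python
from collections import Counter


def group_by_minute(entries):
    """Yield (hour, minute, message, count) rows, one batch per minute.

    `entries` is an iterable of (hour, minute, message) tuples in log order.
    The name and tuple layout are illustrative assumptions.
    """
    batch = []      # messages accumulated for the current minute
    current = None  # (hour, minute) of the batch being accumulated

    def flush():
        # Count each distinct message in the batch, then clear the batch.
        counts = Counter(batch)
        rows = [(current[0], current[1], msg, n) for msg, n in counts.items()]
        batch.clear()
        return rows

    for hour, minute, msg in entries:
        # A change of hour or minute closes the previous batch.
        if current is not None and (hour, minute) != current:
            yield from flush()
        current = (hour, minute)
        batch.append(msg)

    # Emit the final batch once the input is exhausted.
    if batch:
        yield from flush()


sample = [
    ("19", "02", "connect ECONNREFUSED 127.0.0.1:7000"),
    ("19", "02", "ReferenceError: foo is not defined"),
    ("19", "02", "connect ECONNREFUSED 127.0.0.1:7000"),
    ("19", "03", "connect ECONNREFUSED 127.0.0.1:7000"),
]

for hour, minute, msg, n in group_by_minute(sample):
    print(f"{hour}:{minute}  errMessage : {msg}  {n}")
```

Unlike a hash-table walk, Counter reports messages in first-seen order, so the order of rows within a minute may differ from the TXR output.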

Parsing the entries is done using pattern matching. Furthermore, we extract the JSON part as one unit and use the JSON parser in TXR Lisp. That produces JSON objects as hash tables; we walk the two levels of table to get to errMessage.
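The parsing step can be sketched in Python as well, under the assumption that every line has the shape shown in the sample log: a timestamp, the literal " - info: ", and then one JSON object. The names LINE_RE and parse_line are made up for this example; the regular expression only picks out the hour and minute and hands the JSON part to the standard json module, in the same spirit as extracting the JSON as one unit.

```python
import json
import re

# Timestamp such as 2019-08-09T19:14:03:773+07:00, then " - info: ", then JSON.
# The pattern is an assumption based on the sample lines in the question.
LINE_RE = re.compile(r"^[^T]*T(\d{2}):(\d{2}):.*? - info: (\{.*\})\s*$")


def parse_line(line):
    """Return (hour, minute, errMessage), or None if the line does not fit."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    hour, minute, payload = m.groups()
    try:
        obj = json.loads(payload)
    except json.JSONDecodeError:
        return None
    # Walk the two levels: top-level object, then the "tagName" object.
    tag_name = obj.get("tagName")
    if not isinstance(tag_name, dict):
        return None
    msg = tag_name.get("errMessage")
    if msg is None:
        return None
    return hour, minute, msg


sample = (
    '2019-08-09T19:14:03:773+07:00 - info: {"tag":"request /validate",'
    '"tagName":{"status":"05","message":"Unrecognise Request","data":{},'
    '"errMessage":"error internal"},"metadata":"test3",'
    '"logTime":"2019-08-09T12:14:03.773Z","responseTime":7}'
)

print(parse_line(sample))
```

Lines that do not match the pattern, or whose JSON lacks tagName.errMessage, are skipped by returning None rather than raising an error.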
