Trying to parse output of "lastb" command via awk, but when the userID field is blank, my variables are thrown off

Question

I'm trying to pull the date and IP address from the output of lastb.

I'm using: lastb | awk '{print $5,$6,$7,$3}'

The problem is that sometimes the first column (userID) of lastb is blank:

Data example:

wpsadmin ssh:notty    213.109.202.127  Tue Oct  1 12:07 - 12:07  (00:00)
         ssh:notty    8.219.222.66     Tue Oct  1 11:48 - 11:48  (00:00)
quser    ssh:notty    213.109.202.127  Tue Oct  1 11:03 - 11:03  (00:00)
udatabas ssh:notty    139.19.117.130   Tue Oct  1 10:34 - 10:34  (00:00)
Admin    ssh:notty    213.109.202.127  Tue Oct  1 09:58 - 09:58  (00:00)
         ssh:notty    79.110.62.93     Tue Oct  1 09:40 - 09:40  (00:00)
udatabas ssh:notty    139.19.117.130   Tue Oct  1 09:34 - 09:34  (00:00)

...which throws off my awk variables by one.

So using: lastb | awk '{print $5,$6,$7,$3}' for the dataset above gives me:

Oct 1 12:07 213.109.202.127
1 11:48 - Tue
Oct 1 11:03 213.109.202.127
Oct 1 10:34 139.19.117.130
Oct 1 09:58 213.109.202.127
1 09:40 - Tue
Oct 1 09:34 139.19.117.130

How do I rectify this? Thanks!

Ed Morton · Answer · 2024-11-09 12:02:52Z

4

Just count the fields you want to print from the end (NF) instead of from the start, e.g. using any awk:

$ lastb | awk '{print $(NF-5), $(NF-4), $(NF-3), $(NF-7)}'
Oct 1 12:07 213.109.202.127
Oct 1 11:48 8.219.222.66
Oct 1 11:03 213.109.202.127
Oct 1 10:34 139.19.117.130
Oct 1 09:58 213.109.202.127
Oct 1 09:40 79.110.62.93
Oct 1 09:34 139.19.117.130
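
One caveat when counting from the end: on a line with fewer than eight fields, $(NF-7) refers to $0 or to a negative field number, which awk implementations typically reject as an error. A minimal sketch of a guard, assuming every record of interest has at least eight fields; the helper name from_end and the trailer line in the sample input are made up for illustration:

```shell
# Hypothetical helper: print "month day time address", counting fields from the end.
# The NF >= 8 guard (an addition, not part of the answer above) skips lines
# that are too short to hold all of those fields.
from_end() {
  awk 'NF >= 8 { print $(NF-5), $(NF-4), $(NF-3), $(NF-7) }'
}

# Sample input: two lines from the question plus an empty line and a
# made-up trailer line, standing in for non-record lines.
printf '%s\n' \
  'wpsadmin ssh:notty    213.109.202.127  Tue Oct  1 12:07 - 12:07  (00:00)' \
  '         ssh:notty    8.219.222.66     Tue Oct  1 11:48 - 11:48  (00:00)' \
  '' \
  'btmp begins Tue Oct  1 09:34:00 2024' |
  from_end
```

With that input, only the two login records are printed; the empty line and the trailer are skipped.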

edited Nov 9, 2024 at 12:02

answered Nov 9, 2024 at 11:57



Thank you Ed! This is the solution I ended up going with, but I assigned the "answer" to Stéphane Chazelas for all the additional information in that answer regarding formatting and the upcoming move away from lastb.
– ZacWolf
Commented Nov 10, 2024 at 21:29

@ZacWolf you're welcome, glad you got an answer.
– Ed Morton
Commented Nov 10, 2024 at 21:43


Stéphane Chazelas · Accepted Answer · 2024-11-10 22:14:04Z

In awk, the default value of FS is a single space character, and that value has a very special meaning. It means fields are separated by sequences of one or more blank characters¹ but also that leading and trailing blanks are ignored.

Here you don't want that second part, which you can avoid with:

$ lastb | awk -F '[[:blank:]]+' '{print $5,$6,$7,$3}'
Oct 1 12:07 213.109.202.127
Oct 1 11:48 8.219.222.66
Oct 1 11:03 213.109.202.127
Oct 1 10:34 139.19.117.130
Oct 1 09:58 213.109.202.127
Oct 1 09:40 79.110.62.93
Oct 1 09:34 139.19.117.130

Here, with FS (as set with the -F option) being [[:blank:]]+, any sequence of one or more blanks constitutes a field separator, including the ones at the start of the line. If a line starts with blanks, $1 will be the empty string that precedes them.
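
To see the difference, the following sketch prints the field count and the first field of one of the question's blank-username lines under both settings (show_first is a hypothetical helper name):

```shell
# A blank-username line copied from the question.
line='         ssh:notty    8.219.222.66     Tue Oct  1 11:48 - 11:48  (00:00)'

# Hypothetical helper: report NF and $1; any arguments (such as -F) are passed to awk.
show_first() {
  awk "$@" '{ printf "NF=%d first=[%s]\n", NF, $1 }'
}

printf '%s\n' "$line" | show_first                     # NF=9 first=[ssh:notty]
printf '%s\n' "$line" | show_first -F '[[:blank:]]+'   # NF=10 first=[]
```

With the default FS the leading blanks vanish and ssh:notty becomes $1; with the regex FS an empty $1 keeps the remaining columns in place.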

That assumes the first 3 fields don't contain whitespace. In practice, that can't be guaranteed. I find that with failed ssh login attempts, not even newlines are escaped, so it's hard to parse that output reliably.

I can do ssh -l $'a b\nc' localhost and the lastb output will have:

a b
c    ssh:notty    127.0.0.1        Sun Nov 10 14:57 - 14:57  (00:00)
chazelas seat0        login screen     Fri Oct 25 08:19 - 08:19  (00:00)

(see also that genuine failed attempt where the third field login screen contains a space).

I cannot however cause a : to be included in the username field.

So @EdMorton's approach to count fields from the end would be better here, and made even more reliable if you check that the line contains [[:blank:]]ssh: to filter out entries not by sshd, or the first lines of the username for those usernames that contain newlines.

lastb | awk '/[[:blank:]]ssh:/ {print $(NF-5), $(NF-4), $(NF-3), $(NF-7)}'
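
As a check, the filter can be run against the three sample lines shown above (a sketch; ssh_failures is a hypothetical wrapper name, and real lastb output may contain further variations):

```shell
# Hypothetical wrapper around the filter from this answer: keep only lines
# where a blank precedes "ssh:", then count fields from the end.
ssh_failures() {
  awk '/[[:blank:]]ssh:/ { print $(NF-5), $(NF-4), $(NF-3), $(NF-7) }'
}

# The username "a b<newline>c" spans two lines; the last line is a
# non-ssh entry whose third column contains a space.
printf '%s\n' \
  'a b' \
  'c    ssh:notty    127.0.0.1        Sun Nov 10 14:57 - 14:57  (00:00)' \
  'chazelas seat0        login screen     Fri Oct 25 08:19 - 08:19  (00:00)' |
  ssh_failures
```

Only the ssh:notty line survives the filter, giving Nov 10 14:57 127.0.0.1.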

FYI, the lastb utility has been gone from Debian since May 2024, so relying on it may not be future-proof. That seems premature to me, as most things still log entries into wtmp/btmp and few things log into wtmpdb yet; it seems to be part of the rush to fix the Y2038 problem. See the related NEWS entry in the util-linux Debian package:

util-linux (2.40.1-2) unstable; urgency=medium

last(1) has been split off to the wtmpdb package. If you find last(1) useful, please install wtmpdb and accept the default PAM configuration changes from libpam-wtmpdb.

lastb(1) is removed. Please see syslog/journal for failed login attempts.

-- Chris Hofstaedtler <[email protected]> Wed, 29 May 2024 23:52:19 +0200

(you also need to install libpam-wtmpdb which is only recommended by wtmpdb, not a hard dependency).

Getting the timestamp (in a standard parsable format, with microsecond precision as a bonus²) and the source address of failed ssh authentication attempts from journalctl could look like:

journalctl -qaro short-iso-precise --no-hostname -u ssh.service \
  -g '^Failed .* user .* from [\d.]+ port \d+ ssh\d*\z' |
  sed -E 's/ .* from( [^ ]+).*/\1/'

Or for some JSON output for easier post-processing:

journalctl -qaro json \
  --output-fields=_SOURCE_REALTIME_TIMESTAMP,MESSAGE \
  -u ssh.service \
  -g '^Failed .* user .* from [\d.]+ port \d+ ssh\d*\z' |
  jq -c '
    {
      "ts": (
        ._SOURCE_REALTIME_TIMESTAMP as $t |
        $t[0:-6] |
        tonumber |
        strflocaltime("%FT%T." + $t[-6:] + "%z")
      )
    } + (
      .MESSAGE |
        capture(" user (?<user>.*) from (?<ip>[^ ]+) port (?<port>\\d+)")
    ) |
     .port |= tonumber'

Which gives something like:

{"ts":"2024-11-10T19:26:12.952796+0000","user":"\\377\\377\\377\\377\\377\\377\\377\\377","ip":"127.0.0.1","port":42126}
{"ts":"2024-11-10T19:23:52.749940+0000","user":"\\001\\002\\003\\004\\005\\006\\r\\v\\t","ip":"127.0.0.1","port":47172}
{"ts":"2024-11-10T19:18:57.094019+0000","user":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","ip":"127.0.0.1","port":59338}
{"ts":"2024-11-10T17:11:17.574040+0000","user":"\\351","ip":"127.0.0.1","port":46272}
{"ts":"2024-11-10T14:58:46.607290+0000","user":"a b c d e f\\nc","ip":"127.0.0.1","port":57374}
{"ts":"2024-11-10T14:57:31.148206+0000","user":"a b\\nc","ip":"127.0.0.1","port":35566}
{"ts":"2024-11-10T14:56:40.154428+0000","user":"a","ip":"127.0.0.1","port":51774}
{"ts":"2024-11-10T14:55:57.128920+0000","user":"foo bar\\nbaz","ip":"127.0.0.1","port":39012}
{"ts":"2024-11-10T14:50:07.156848+0000","user":"foo\\nbar","ip":"127.0.0.1","port":55176}
{"ts":"2024-11-10T14:48:52.938861+0000","user":"root ssh ssh         127.0.0.1        Sun Nov 10 14.38 _ 14.38   00.00\\nbar","ip":"127.0.0.1","port":38166}
{"ts":"2024-11-10T14:34:17.532900+0000","user":"root ssh","ip":"127.0.0.1","port":55498}
{"ts":"2024-11-10T14:29:01.636760+0000","user":"x\\ny","ip":"127.0.0.1","port":54132}
{"ts":"2024-11-10T14:28:43.724771+0000","user":"foo bar","ip":"127.0.0.1","port":37306}
{"ts":"2024-11-09T15:40:59.024960+0000","user":"stephane","ip":"172.17.27.2","port":52536}
{"ts":"2024-11-09T15:40:35.582616+0000","user":"qweq","ip":"172.17.27.2","port":54234}

(also using _SOURCE_REALTIME_TIMESTAMP to get the time when the event was generated, rather than received).

There you can see my previous attempts at fooling lastb being handled in a more useful way than by lastb itself, even with lastb --time-format=iso -w.
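
The timestamp handling in the jq filter relies on _SOURCE_REALTIME_TIMESTAMP being a decimal count of microseconds since the epoch: the last six digits are the sub-second part and the rest is whole seconds. The same split can be sketched in plain shell; the helper name split_usec and the sample value are made up for illustration:

```shell
# Hypothetical helper: split a microsecond epoch timestamp into
# "seconds.microseconds" using POSIX parameter expansion.
split_usec() {
  t=$1
  secs=${t%??????}        # everything but the last six digits
  usecs=${t#"$secs"}      # the last six digits
  printf '%s.%s\n' "$secs" "$usecs"
}

split_usec 1731266772952796   # prints 1731266772.952796
```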

¹ per POSIX, that's newline or any character for which iswblank() returns true, but you'll find in some awk implementations that it's any character for which iswspace() returns true (space being a superset of blank which includes newline and other vertical whitespace ones such as carriage return, form feed, vertical tab...). For those that don't support multibyte encodings, it's only the single byte ones (isblank()/isspace()).

² See also lastb --time-format=iso to get that format with lastb, though without subsecond precision.

Prabhjot Singh · Answer · 2024-11-09 11:00:18Z

1

Using awk, either of:

$ lastb | awk '{split($0,a,/ssh:[^[:space:]]+/);$0= a[2]}{print $3,$4,$5,$1}'

$ lastb | awk '$1 !~ /ssh:/{$1="";$0=$0}{print $4,$5,$6,$2}'

edited Nov 9, 2024 at 11:00

answered Nov 9, 2024 at 8:58


This will fail when user Sam Shade (username sshade) shows up.
– G-Man Says 'Reinstate Monica'
Commented Nov 9, 2024 at 10:38

@G-ManSays'ReinstateMonica' Yes. I've edited the command. Is it correct now?
– Prabhjot Singh
Commented Nov 9, 2024 at 11:01


markp-fuso · Answer · 2024-11-08 23:41:53Z

To simulate OP's lastb call:

$ cat lastb.out
wpsadmin ssh:notty    213.109.202.127  Tue Oct  1 12:07 - 12:07  (00:00)
         ssh:notty    8.219.222.66     Tue Oct  1 11:48 - 11:48  (00:00)
quser    ssh:notty    213.109.202.127  Tue Oct  1 11:03 - 11:03  (00:00)
udatabas ssh:notty    139.19.117.130   Tue Oct  1 10:34 - 10:34  (00:00)
Admin    ssh:notty    213.109.202.127  Tue Oct  1 09:58 - 09:58  (00:00)
         ssh:notty    79.110.62.93     Tue Oct  1 09:40 - 09:40  (00:00)
udatabas ssh:notty    139.19.117.130   Tue Oct  1 09:34 - 09:34  (00:00)

One general approach:

we need to determine if we should print fields 5,6,7,3 or fields 4,5,6,2
5,6,7,3 is the same thing as (5-0),(6-0),(7-0),(3-0)
4,5,6,2 is the same thing as (5-1),(6-1),(7-1),(3-1)
we can generalize this to (5-offset),(6-offset),(7-offset),(3-offset) where offset is 0 or 1
we can test attributes of the 1st field to determine the value of offset

A couple awk ideas based on different attributes of the 1st field:

########
# determine offset by contents of 1st field

awk '
{ offset = ( $1 ~ /ssh:/ ) ? 1 : 0                       # if 1st field contains string "ssh:" then offset = 1, else offset = 0
  print $(5-offset),$(6-offset),$(7-offset),$(3-offset)  # apply offset to determine which fields to print
}'

########
# determine offset by position of 1st field in the line

awk '
{ offset = ( index($0,$1) != 1 ) ? 1 : 0                 # if 1st field does not start in column 1 then offset = 1, else offset = 0
  print $(5-offset),$(6-offset),$(7-offset),$(3-offset)
}'

Another approach would be to insert a 'placeholder' field at the front of the line when the 1st field is 'missing'.

One awk idea using the index($0,$1) attribute:

awk '
{ $1 = (index($0,$1)==1 ? "" : "placeholder" FS) $1    # if 1st field starts in column 1 then no placeholder is needed otherwise prepend the 1st field with the string "placeholder" 
  $0=$0                                                # force awk to reparse the new line; since all lines now have the same number of fields ...
  print $5,$6,$7,$3                                    # there's no need for an offset
}'

Removing comments, collapsing into one-liners, and reading lastb output from stdin:

cat lastb.out | awk '{offset = ($1~/ssh:/) ? 1 : 0; print $(5-offset),$(6-offset),$(7-offset),$(3-offset)}'

cat lastb.out | awk '{offset = (index($0,$1)!=1) ? 1 : 0; print $(5-offset),$(6-offset),$(7-offset),$(3-offset)}'

cat lastb.out | awk '{$1 = (index($0,$1)==1 ? "" : "placeholder" FS) $1; $0=$0; print $5,$6,$7,$3}'

All of these generate:

Oct 1 12:07 213.109.202.127
Oct 1 11:48 8.219.222.66
Oct 1 11:03 213.109.202.127
Oct 1 10:34 139.19.117.130
Oct 1 09:58 213.109.202.127
Oct 1 09:40 79.110.62.93
Oct 1 09:34 139.19.117.130

Instead of testing attributes of the 1st field you could (at least for the sample data we've been provided) test the number of fields (NF):

NF==9 ==> 9 fields: we're 'missing' the 1st field so we set offset=1 or prepend placeholder at the front of the line
NF==10 ==> 10 fields: we set offset=0 or prepend nothing at the front of the line
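
A sketch of that idea, assuming (as in the sample data) that complete lines have 10 fields and lines with a blank username have 9; by_nf is a hypothetical helper name:

```shell
# Hypothetical helper: derive the offset from the field count.
# Assumes exactly 9 fields means the username column is blank.
by_nf() {
  awk '{ offset = (NF == 9) ? 1 : 0
         print $(5-offset), $(6-offset), $(7-offset), $(3-offset) }'
}

printf '%s\n' \
  'Admin    ssh:notty    213.109.202.127  Tue Oct  1 09:58 - 09:58  (00:00)' \
  '         ssh:notty    79.110.62.93     Tue Oct  1 09:40 - 09:40  (00:00)' |
  by_nf
```

This depends on the exact column count, so a username or address field containing whitespace would break it.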

awk 'booleans' including ~ and != already are 1 or 0 (like in C) so the ?1:0 is unnecessary. And a simple way to test if the line (record) begins with a space is /^ /.
– dave_thompson_085
Commented Nov 11, 2024 at 0:30
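
Applied to the offset approach, that comment's suggestion might look like the following sketch (leading_blank is a hypothetical helper name; [[:blank:]] is used in place of a literal space so that a leading tab would also count):

```shell
# Hypothetical helper: the match expression itself evaluates to 1 or 0,
# so it can be assigned to offset directly, with no ?: needed.
leading_blank() {
  awk '{ offset = ($0 ~ /^[[:blank:]]/)
         print $(5-offset), $(6-offset), $(7-offset), $(3-offset) }'
}

printf '%s\n' \
  'udatabas ssh:notty    139.19.117.130   Tue Oct  1 10:34 - 10:34  (00:00)' \
  '         ssh:notty    79.110.62.93     Tue Oct  1 09:40 - 09:40  (00:00)' |
  leading_blank
```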
