Facing problem with awk use

Question

Could you please help with awk, or any other method?

Inputfile contains the following:

PROD   192.168.100.25   Unix                       Active
PROD   192.168.100.26   Unix - Server              Active
DEV    192.168.100.27   windows Gateway            Active
TEST   192.168.100.28   Unix Test Server           Not Active
PROD   192.168.100.29   windows mail gateway       Active down 
PROD   192.168.100.30   Unix                       Active down

I tried awk to get columns 2 and 4, see below:

awk '{print $2  $4}' Inputfile

Result:

192.168.100.25   Active                       
192.168.100.26   -               
192.168.100.27   Gateway             
192.168.100.28   Test           
192.168.100.29   mail        
192.168.100.30   Active

Expected result:

192.168.100.25   Active                       
192.168.100.26   Active               
192.168.100.27   Active             
192.168.100.28   Not Active           
192.168.100.29   Active down        
192.168.100.30   Active down

What is your file's field delimiter? Tab(s) or space(s)? One or more? Or, if we knew how that output is generated, we could also think about fixing it from that side. — αғsнιη
– αғsнιη, Commented Nov 5, 2022 at 10:05
It would be easier if, instead of a space, you used a comma (,) as the delimiter, i.e. the lines would look like this: PROD,192.168.100.29,windows mail gateway,Active down — Edgar Magallon
– Edgar Magallon, Commented Nov 5, 2022 at 10:13
Hi αғsнιη & Edgar, this file is a database file and the file format cannot be changed. How can I find out whether the file delimiter is tab(s) or space(s)? — anukalps
– anukalps, Commented Nov 5, 2022 at 10:29
Run cat -A fileNameHere. Also, you can use @ followed by a name to ping someone, like @anukalps — αғsнιη
– αғsнιη, Commented Nov 5, 2022 at 19:32
Instead of tab or space delimited your data might also be fixed width fields - if so tell us how long the fields are. — Ed Morton
– Ed Morton, Commented Nov 6, 2022 at 16:41

ilkkachu · Accepted Answer · 2022-11-05 11:24:42Z

By default, fields are separated by runs of whitespace in AWK. It doesn't care how much white space there is, so a file like this would give the same result:

PROD   192.168.100.25   Unix   Active
PROD   192.168.100.26   Unix   -        Server   Active

As far as it's concerned, the fourth fields of those two lines are obviously Active and -.
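To see the default splitting for yourself, here is a small sketch that feeds those two lines to awk and prints the field count (NF) next to the fourth field. The variable name out is only there for illustration.

```shell
# Two sample lines; the second one has spaces inside the description.
out=$(printf '%s\n' \
    'PROD   192.168.100.25   Unix   Active' \
    'PROD   192.168.100.26   Unix   -        Server   Active' |
    awk '{ print NF, $4 }')
printf '%s\n' "$out"
```

The first line splits into 4 fields and the second into 6, so $4 is Active on one and - on the other.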

Now, it's possible that your file format is actually such that the fields are separated by tabs, and only contain spaces in the middle. I don't think tabs survive posting in SE, and the spacing you show doesn't exactly match what you'd get with 8 columns wide tabs either, but I'll note this anyway.

Then, the lines would be more like

PROD<tab>192.168.100.26<tab>Unix - Server<tab><tab>Active

and you could tell AWK to use runs of tabs as separator like so:

$ awk -F '\t+' '{print $2, $4}' file.txt
192.168.100.25 Active
192.168.100.26 Active

Though, it'd be more common to have a single tab between each field, which would mean they wouldn't line up so nicely with variable-length data. Then you'd just use awk -F '\t'.
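A minimal sketch of the single-tab case, assuming the real file looks like the made-up sample built here (the temporary file and its contents are hypothetical). sed -n l is used to make the tabs visible first, since it prints them as \t and marks each line end with $.

```shell
# Build a small tab-separated sample in a temporary file.
f=$(mktemp)
printf 'PROD\t192.168.100.26\tUnix - Server\tActive\n' > "$f"
printf 'TEST\t192.168.100.28\tUnix Test Server\tNot Active\n' >> "$f"

# Show the raw lines with tabs rendered as \t.
sed -n l "$f"

# Split on single tabs; spaces inside a field are left alone.
out=$(awk -F '\t' '{ print $2, $4 }' "$f")
printf '%s\n' "$out"
rm -f "$f"
```

If the sed output shows only spaces and no \t, the file is not tab-delimited and this approach won't apply.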

If the columns are actually fixed-width, you could use e.g. cut to pick the parts you need. This may involve manually counting the characters, though.

         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
PROD   192.168.100.29   windows mail gateway       Active down

Taking the count from above, this might work:

$ cut -c8-23,52-70 file-fixed.txt
192.168.100.25  Active
192.168.100.26  Active
192.168.100.27  Active
192.168.100.28  Not Active
192.168.100.29  Active down 
192.168.100.30  Active down

At least GNU AWK also has support for fixed-width fields, but I haven't looked too much into it. See https://www.gnu.org/software/gawk/manual/html_node/Fixed-width-data.html
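If gawk isn't at hand, plain substr() can slice fixed-width records in any POSIX awk. This is only a sketch, not the gawk feature mentioned above, and it assumes the column positions counted earlier really are fixed (address starting at column 8, status starting at column 52). The names line and out are illustrative.

```shell
# Build one fixed-width record: widths 7, 17 and 27, then the status.
line=$(printf '%-7s%-17s%-27s%s' PROD 192.168.100.29 'windows mail gateway' 'Active down')

out=$(printf '%s\n' "$line" | awk '{
    ip = substr($0, 8, 17)      # columns 8-24 hold the address
    status = substr($0, 52)     # column 52 to end of line
    sub(/ +$/, "", ip)          # trim the padding
    sub(/ +$/, "", status)
    print ip, status
}')
printf '%s\n' "$out"
```

Unlike cut, this lets you trim the padding in the same step.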

cut is not working well, as the file does not have fixed-width fields — anukalps
– anukalps, Commented Nov 5, 2022 at 12:19
@anukalps, then I suppose you'll need to specify what the file format is exactly. You can edit your question to add details. — ilkkachu
– ilkkachu, Commented Nov 5, 2022 at 15:14

rexypoo · Accepted Answer · 2022-11-05 20:24:15Z

First off, to be clear, awk is doing exactly what you asked it to do. By default, it will separate fields on any whitespace, and since the columns of interest can contain space characters, every space denotes a new field to awk.

When you have two arbitrary string inputs, it gets pretty hard to separate fields after they've been formatted by the Linux column command. In my experience it is better to reach for a more robust programming language such as Python. Note that Python's csv module infers delimiters rather than column widths; for fixed-width data, pandas' read_fwf can infer the column boundaries.

If you could change the last field to only use the options "Active", "Inactive", or "Down", then you could just use the following:

awk '{ print $2, $NF }' Inputfile

NF is the number of fields on the current line, so $NF selects the last field.

But, the last column of data may or may not contain a space, which breaks this syntax.
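A quick sketch of that failure mode, using two lines modelled on the question's data (the variable name out is only for illustration):

```shell
# Print the field count, the address, and the last field.
out=$(printf '%s\n' \
    'PROD   192.168.100.25   Unix                Active' \
    'TEST   192.168.100.28   Unix Test Server    Not Active' |
    awk '{ print NF, $2, $NF }')
printf '%s\n' "$out"
```

The second line has 7 whitespace-separated fields, and $NF is just Active, so the "Not" is silently lost.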

Alternatively, as others have pointed out, if the fields are tab delimited, you should be able to use:

awk -F $'\t' '{ print $2, $4 }' Inputfile

The -F flag specifies the field separator, and $'\t' is the tab character.

However, if the entries are not tab delimited, you probably need to do something specific to this data format. If there are other input files that are formatted differently, you might still see failures.

Here's an example that will use any instance of two or more space characters as a field separator:

awk -F ' {2,}' '{ print $2, $NF }' Inputfile

In this case our field separator is a regular expression. It amounts to "any instance of two or more space characters."

This should work for the example you've given, but if you needed a field other than the first or last field, you could still run into trouble when column 3 contains multiple spaces.
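Some awk implementations (older mawk releases, as far as I know) don't understand interval expressions like {2,}. A sketch of a more portable spelling is two literal spaces followed by +, which also means "two or more spaces". The sample lines are modelled on the question's data, and the last one carries trailing blanks on purpose, so the sketch trims them before picking the last field.

```shell
out=$(printf '%s\n' \
    'PROD   192.168.100.25   Unix   Active' \
    'PROD   192.168.100.26   Unix - Server   Active' \
    'TEST   192.168.100.28   Unix Test Server   Not Active' \
    'PROD   192.168.100.29   windows mail gateway   Active down  ' |
    awk -F '  +' '{
        sub(/[ \t]+$/, "")    # drop trailing blanks so the last field is not empty
        print $2, $NF
    }')
printf '%s\n' "$out"
```

This still depends on every column being separated by at least two spaces and no field containing a double space itself.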

canupseq · Accepted Answer · 2022-11-06 11:39:16Z


To use awk in this case first replace the space in "Not Active" and also "Active down" by some other character:

sed 's/Not Active/Not_Active/g ; s/Active down/Active_down/g' inputfile

Then use awk and extract the second and last fields:

awk '{print $2,$NF}'

and finally restore those spaces:

sed 's/_/ /g'

Putting it all together the final command is:

sed 's/Not Active/Not_Active/g ; s/Active down/Active_down/g' inputfile | awk '{print $2,$NF}' | sed 's/_/ /g'
