1

I've read a bit about awk. It has proven extremely useful for working with a single data file. Suppose I have two input files:

## inp1
x y
1 3
2 4
6 9
... 

## inp2
x z
1 5
2 19
6 9

I want to output something that 'combines' both files. Something like:

## output
x y z
1 3 5
2 4 19
6 9 9

One idea is to interleave the two files, as in https://stackoverflow.com/questions/4011814/how-to-interleave-lines-from-two-text-files, and then process the result with awk.

Or maybe something using associative arrays? I'm not too sure however, which is the reason for this question ;).

I am using Linux.

2
  • 2
    There are some cases that should be explicitly stated, such as: what should happen if inp1 has a line for x=4 and inp2 does not, or vice versa; and if one file (or both) has two lines for x=7? Commented Jun 13, 2020 at 15:09
  • @Paul_Pedant I agree, and not only that. It was not explicitly stated if the files should be "combined" using the x column (or should it be the first column, regardless of the header?). I hope I am not being pedant, but I think otherwise the problem is not quite defined.
    – Quasímodo
    Commented Jun 13, 2020 at 22:00

3 Answers

4

Sounds like you're simply looking for join to join the files on the first field:

$ join -j1 file1 file2 
x y z
1 3 5
2 4 19
6 9 9

Note that join expects its input to be sorted, so you might need to do:

$ join -j1 <(sort file1) <(sort file2 )
1 3 5
2 4 19
6 9 9
x y z

However, that sorts the header line along with the data and moves it to the end, so to avoid that you could do:

$ join -j1 <(head -n1 file1) <(head -n1 file2); join -j1 <(tail -n+2 file1 | sort) <(tail -n+2 file2 | sort)
x y z
1 3 5
2 4 19
6 9 9

And to save that to a new file:

(
    join -j1 <(head -n1 file1) <(head -n1 file2)
    join -j1 <(tail -n+2 file1 | sort) <(tail -n+2 file2 | sort)
) > newFile
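
The commands above only print keys that occur in both files. If some x values exist in only one file (the case raised in the comments on the question), join's -a, -e and -o options can keep the unpaired lines and fill the gap. This is a minimal sketch, not part of the original answer: the input data, the temporary file names and the NA placeholder are all assumptions made up for illustration.

```shell
tmp=$(mktemp -d)
# Hypothetical inputs: x=4 appears only in file1, x=7 only in file2.
printf 'x y\n1 3\n2 4\n4 8\n' > "$tmp/file1"
printf 'x z\n1 5\n2 19\n7 1\n' > "$tmp/file2"

# Split each file into header and sorted body so the header is never sorted.
head -n 1 "$tmp/file1" > "$tmp/head1"
head -n 1 "$tmp/file2" > "$tmp/head2"
tail -n +2 "$tmp/file1" | sort > "$tmp/body1"
tail -n +2 "$tmp/file2" | sort > "$tmp/body2"

outer=$(
    # Header line first.
    join "$tmp/head1" "$tmp/head2"
    # -a 1 -a 2: also print unpaired lines from both files.
    # -o 0,1.2,2.2: key, then field 2 of each file; -e NA fills missing fields.
    join -a 1 -a 2 -e NA -o 0,1.2,2.2 "$tmp/body1" "$tmp/body2"
)
printf '%s\n' "$outer"
rm -r "$tmp"
```

With these made-up inputs, the rows for x=4 and x=7 are kept, with NA in the column that has no value.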

Alternatively, with awk:

$ awk 'NR==FNR{a[$1]=$2; next}{print $1,$2,a[$1]}' file2 file1 
x y z
1 3 5
2 4 19
6 9 9
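
For readers new to the NR==FNR idiom, the one-liner above can be spelled out as follows. This is only a commented restatement run against the sample data from the question; the temporary directory and the array name z are choices made here for illustration.

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf 'x y\n1 3\n2 4\n6 9\n' > "$tmp/file1"
printf 'x z\n1 5\n2 19\n6 9\n' > "$tmp/file2"

merged=$(awk '
    # NR == FNR is only true while reading the first file named (file2):
    # remember its second column, keyed by its first column.
    NR == FNR { z[$1] = $2; next }
    # Every later line comes from file1: print both of its columns,
    # followed by the value remembered for the same key.
    { print $1, $2, z[$1] }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$merged"
rm -r "$tmp"
```

Because the lookup is by key rather than by line number, the two files do not need to be sorted or to list the keys in the same order. A key that is missing from file2 simply yields an empty third column.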
1
  • I'd prefer the awk answer... Thanks!!
    – Suraaj K S
    Commented Jun 13, 2020 at 15:13
0

With the sample input you provided, all you need is this if the fields are tab-separated:

$ paste file1 <(cut -f2 file2)
x   y   z
1   3   5
2   4   19
6   9   9

or this if separated by a single blank:

$ paste -d' ' file1 <(cut -d' ' -f2 file2)
x y z
1 3 5
2 4 19
6 9 9
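
Note that paste pairs lines purely by position, so it only gives the right answer when both files list the same keys in the same order. A rough sketch of a guard for that, assuming single-blank separators and the sample data from the question (the temporary file names and the fallback message are made up here):

```shell
tmp=$(mktemp -d)
printf 'x y\n1 3\n2 4\n6 9\n' > "$tmp/file1"
printf 'x z\n1 5\n2 19\n6 9\n' > "$tmp/file2"

# Extract the first column of each file so the keys can be compared.
cut -d' ' -f1 "$tmp/file1" > "$tmp/keys1"
cut -d' ' -f1 "$tmp/file2" > "$tmp/keys2"

if cmp -s "$tmp/keys1" "$tmp/keys2"; then
    # Keys line up: pasting by position is safe.
    cut -d' ' -f2 "$tmp/file2" > "$tmp/col_z"
    pasted=$(paste -d' ' "$tmp/file1" "$tmp/col_z")
else
    pasted='key columns differ; use join or awk instead'
fi
printf '%s\n' "$pasted"
rm -r "$tmp"
```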
0

Using awk:

  • Method 1 (pairs lines by position)
    awk 'FNR==NR{a[FNR]=$2;next}{print $1,a[FNR],$2}' f1 f2
    
  • Method 2 (matches lines on the first column)
    awk 'NR==FNR{a[$1]=$2;next}($1 in a) {print $1,a[$1],$2}' f1 f2
    

Output

x y z
1 3 5
2 4 19
6 9 9
