A better paste command

Question

I have the following two files (I padded the lines with dots so every line in a file is the same width, and made file1 all caps to make it clearer).

contents of file1:

ETIAM......
SED........
MAECENAS...
DONEC......
SUSPENDISSE

contents of file2

Lorem....
Proin....
Nunc.....
Quisque..
Aenean...
Nam......
Vivamus..
Curabitur
Nullam...

Notice that file2 is longer than file1.

When I run this command:

paste file1 file2

I get this output

ETIAM...... Lorem....
SED........ Proin....
MAECENAS... Nunc.....
DONEC...... Quisque..
SUSPENDISSE Aenean...
    Nam......
    Vivamus..
    Curabitur
    Nullam...

What can I do to get the following output instead?

ETIAM...... Lorem....
SED........ Proin....
MAECENAS... Nunc.....
DONEC...... Quisque..
SUSPENDISSE Aenean...
            Nam......
            Vivamus..
            Curabitur
            Nullam...

I tried

paste file1 file2 | column -t

but it does this:

ETIAM......  Lorem....
SED........  Proin....
MAECENAS...  Nunc.....
DONEC......  Quisque..
SUSPENDISSE  Aenean...
Nam......
Vivamus..
Curabitur
Nullam...

Not as ugly as the original output, but still wrong column-wise.

paste is using tabs in front of the lines from the second file. You may have to use a postprocessor to align the columns appropriately.
– unxnut, Commented Nov 5, 2013 at 14:17
paste file[12] | column -s $'\t' -t -o ' ' or have I missed something?
– user380915, Commented Feb 24, 2021 at 17:03
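
To see the tabs that the first comment refers to, the output of paste can be made visible with sed's l command. This is only a small sketch: the scratch directory created with mktemp is an assumption for illustration, and the two files are rebuilt from the question's listing.

```shell
# Recreate the question's sample files in a scratch directory (illustrative setup).
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1
printf '%s\n' ETIAM...... SED........ MAECENAS... DONEC...... SUSPENDISSE > file1
printf '%s\n' Lorem.... Proin.... Nunc..... Quisque.. Aenean... \
              Nam...... Vivamus.. Curabitur Nullam... > file2

# "sed -n l" writes each line in an unambiguous form:
# a tab is shown as \t and the end of the line as $.
out=$(paste file1 file2 | sed -n l)
printf '%s\n' "$out"
```

Lines that have no partner in file1 start with a bare tab, which is why the terminal renders them at the first tab stop rather than under the second column.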

Mark Plotnick · Accepted Answer · 2018-12-07 16:07:32Z

21

Assuming you don't have any tab characters in your files,

paste file1 file2 | expand -t 13

with the arg to -t suitably chosen to cover the desired max line width in file1.

OP has added a more flexible solution:

I did this so it works without the magic number 13:

paste file1 file2 | expand -t $(( $(wc -L <file1) + 2 ))

It's not easy to type but can be used in a script.
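
wc -L is not specified by POSIX, so where it is unavailable the width could be computed with awk instead. The following is a sketch under stated assumptions: the scratch directory and the variable name width are illustrative, the tab stop is set to the widest line plus one (rather than plus two) so the result matches the spacing asked for in the question, and length counts bytes rather than characters in some awk implementations.

```shell
# Recreate the question's sample files in a scratch directory (illustrative setup).
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1
printf '%s\n' ETIAM...... SED........ MAECENAS... DONEC...... SUSPENDISSE > file1
printf '%s\n' Lorem.... Proin.... Nunc..... Quisque.. Aenean... \
              Nam...... Vivamus.. Curabitur Nullam... > file2

# Widest line in file1, plus one column for the gap between the columns.
width=$(awk '{ if (length($0) > max) max = length($0) } END { print max + 1 }' file1)

# Expand the single tab that paste inserts to the next multiple of $width.
out=$(paste file1 file2 | expand -t "$width")
printf '%s\n' "$out"
```

As with the original command, this assumes neither file contains tab characters of its own.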

edited Dec 7, 2018 at 16:07

answered Nov 5, 2013 at 15:02

Mark Plotnick

nice! I didn't know about expand before I read your answer :)
– TabeaKischka, Commented Dec 7, 2018 at 14:30

phuclv · Accepted Answer · 2025-11-23 15:16:35Z

I thought awk might do it nicely, so I googled "awk reading input from two files" and found an article on Stack Overflow to use as a starting point.

First is the condensed version, then the fully commented one below that. This took more than a few minutes to work out. I'd be glad of some refinements from smarter folks.

awk '{if(length($0)>max)max=length($0)}
FNR==NR{s1[FNR]=$0;next}{s2[FNR]=$0}
END { format = "%-" max "s\t%-" max "s\n";
  numlines=(NR-FNR)>FNR?NR-FNR:FNR;
  for (i=1; i<=numlines; i++) { printf format, s1[i], s2[i] }
}' file1 file2

And here is the fully documented version of the above.

# 2013-11-05 [email protected]
# Invoke thus:
#   awk -f this_file file1 file2
# The result is what you asked for and the columns will be
# determined by input file order.
#----------------------------------------------------------
# No matter which file we're reading,
# keep track of max line length for use
# in the printf format.
#
{ if ( length($0) > max ) max=length($0) }

# FNR is record number in current file
# NR is record number over all
# while they are equal, we're reading the first file
#   and we load the strings into array "s1"
#   and then go to the "next" line in the file we're reading.
FNR==NR { s1[FNR]=$0; next }

# and when they aren't, we're reading the
#   second file and we put the strings into
#   array s2
{s2[FNR]=$0}

# At the end, after all lines from both files have
# been read,
END {
  # use the max line length to create a printf format
  # with the right widths
  format = "%-" max "s\t%-" max "s\n"
  # and figure the number of array elements we need
  # to cycle through in a for loop.
  numlines=(NR-FNR)>FNR?NR-FNR:FNR;
  for (i=1; i<=numlines; i++) {
     # a missing array element expands to the empty string,
     # so no explicit test is needed (a test would also blank out a line like "0")
     printf format, s1[i], s2[i]
  }
}

+1 this is the only answer that does work with arbitrary input (i.e. with lines that may contain tabs). I don't think this could be significantly refined/improved.
– don_crissti, Commented Feb 15, 2017 at 21:21

ninjalj · Accepted Answer · 2013-11-06 17:06:58Z

On Debian and derivatives, column has a -n nomerge option that allows column to do the right thing with empty fields. Internally, column uses the wcstok(wcs, delim, ptr) function, which splits a wide character string into tokens delimited by the wide characters in the delim argument.

wcstok starts by skipping wide characters in delim before recognizing the token. The -n option uses an algorithm that doesn't skip initial wide characters in delim.

Unfortunately, this isn't very portable: -n is Debian-specific, and column is not in POSIX; it's apparently a BSD thing.

unxnut · Accepted Answer · 2013-11-05 14:21:32Z

2

Not a very good solution but I was able to do it using

paste file1 file2 | sed 's/^TAB/&&/'

where TAB is replaced with the tab character.
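
One way to avoid typing a literal tab is to keep it in a shell variable, as in the sketch below. The variable name tab and the scratch directory are assumptions for illustration. The trick only lines up on a terminal with tab stops every 8 columns when the lines of file1 are 8 to 15 characters wide, because it relies on one tab after an 11-character line and two tabs after an empty one ending at the same stop.

```shell
# Recreate the question's sample files in a scratch directory (illustrative setup).
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1
printf '%s\n' ETIAM...... SED........ MAECENAS... DONEC...... SUSPENDISSE > file1
printf '%s\n' Lorem.... Proin.... Nunc..... Quisque.. Aenean... \
              Nam...... Vivamus.. Curabitur Nullam... > file2

# Hold a literal tab in a variable so it can be spliced into the sed script.
tab=$(printf '\t')

# & stands for the matched text, so a leading tab is replaced by two tabs.
out=$(paste file1 file2 | sed "s/^$tab/&&/")
printf '%s\n' "$out"
```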

answered Nov 5, 2013 at 14:21

unxnut

What is the role of && in the sed command?
– Vombat, Commented Nov 5, 2013 at 14:57
A single & stands for the text that was matched (a tab in this case), so this command simply replaces the tab at the beginning of the line with two tabs.
– unxnut, Commented Nov 5, 2013 at 15:59
I had to change TAB to \t to make this work in zsh on Ubuntu/Debian. And it only works if the lines in file1 are shorter than 15 characters.
– rubo77, Commented Nov 30, 2013 at 6:53

Jeff Taylor · Accepted Answer · 2017-02-15 20:54:18Z

2

Taking out the dots that you used for padding:

file1:

ETIAM
SED
MAECENAS
DONEC
SUSPENDISSE

file2:

Lorem
Proin
Nunc
Quisque
Aenean
Nam
Vivamus
Curabitur
Nullam

Try this:

$ ( echo ".TS"; echo "l l."; paste file1 file2; echo ".TE" ) | tbl | nroff | more

And you will get:

ETIAM         Lorem
SED           Proin
MAECENAS      Nunc
DONEC         Quisque
SUSPENDISSE   Aenean
              Nam
              Vivamus
              Curabitur
              Nullam
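
To make the pipeline less opaque, here is a sketch that prints only the text handed to tbl, without running tbl or nroff. The scratch directory is an assumption for illustration. In tbl input, .TS and .TE mark the start and end of a table, the format line "l l." declares two left-aligned columns (the period ends the format section), and the data lines that follow are split on tabs by default, which is exactly what paste produces.

```shell
# Recreate the unpadded sample files in a scratch directory (illustrative setup).
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1
printf '%s\n' ETIAM SED MAECENAS DONEC SUSPENDISSE > file1
printf '%s\n' Lorem Proin Nunc Quisque Aenean Nam Vivamus Curabitur Nullam > file2

# Build the table source: start macro, column format, tab-separated rows, end macro.
out=$( { echo '.TS'; echo 'l l.'; paste file1 file2; echo '.TE'; } )
printf '%s\n' "$out"
```

tbl turns this description into formatting requests, and nroff renders the result as plain text with both columns sized to their widest entry.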

edited Feb 15, 2017 at 20:54

answered Feb 15, 2017 at 20:47

Jeff Taylor

This, like the other solutions using paste, will fail to print the proper output if any line contains tabs. +1 for being different though.
– don_crissti, Commented Feb 15, 2017 at 21:12
+1. Would you please explain how the solution works?
– Tulains Córdova, Commented Feb 15, 2017 at 22:45

rubo77 · Accepted Answer · 2013-12-02 18:30:15Z

1

An awk solution that should be fairly portable, and should work for an arbitrary number of input files:

# Invoke thus:
#   awk -F\\t -f this_file file1 file2

# every time we read a new file, FNR goes to 1

FNR==1 {
    curfile++                       # current file
}

# read all files and save all the info we'll need
{
    column[curfile,FNR]=$0          # save current line
    nlines[curfile]++               # number of lines in current file
    if (length > len[curfile])
            len[curfile] = length   # max line length in current file
}

# finally, show the lines from all files side by side, as a table
END {
    # iterate through lines until there are no more lines in any file
    for (line = 1; !end; line++) {
            $0 = _
            end = 1

            # iterate through all files, we cannot use
            #   for (file in nlines) because arrays are unordered
            for (file=1; file <= curfile; file++) {
                    # columnate corresponding line from each file
                    $0 = $0 sprintf("%*s" FS, len[file], column[file,line])
                    # at least some file had a corresponding line
                    if (nlines[file] >= line)
                            end = 0
            }

            # don't print a trailing empty line
            if (!end)
                    print
    }
}

edited Dec 2, 2013 at 18:30

rubo77

answered Nov 6, 2013 at 19:49

ninjalj

How do you use this on file1 and file2? I called the script paste-awk and tried paste file1 file2|paste-awk and I tried awk paste-awk file1 file2, but neither worked.
– rubo77, Commented Nov 30, 2013 at 7:04
I get awk: Line:1: (FILENAME=file1 FNR=1) Fatal: Division by zero
– rubo77, Commented Nov 30, 2013 at 7:04
@rubo77: awk -f paste-awk file1 file2 should work, at least for GNU awk and mawk.
– ninjalj, Commented Dec 2, 2013 at 10:32
This works, although it differs slightly from paste: there is less space between the two columns. And if the lines of an input file are not all the same length, that column comes out right-aligned.
– rubo77, Commented Dec 2, 2013 at 14:14
@rubo77: the field separator can be set with -F\\t
– ninjalj, Commented Dec 2, 2013 at 15:30

canupseq · Accepted Answer · 2025-11-22 17:17:55Z

1

You can use the pr command instead:

$ pr -mtT file1 file2
ETIAM......                         Lorem....
SED........                         Proin....
MAECENAS...                         Nunc.....
DONEC......                         Quisque..
SUSPENDISSE                         Aenean...
                                    Nam......
                                    Vivamus..
                                    Curabitur
                                    Nullam...

-m merges the files side by side, and -t and -T suppress the page headers. Check the man page for pr to see all the options.

answered Nov 22 at 17:17

canupseq

Beware it truncates lines wider than 34 columns (with the default page width of 72 columns).
– Stéphane Chazelas, Commented Nov 24 at 20:46
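
Following up on the truncation caveat, the page width can be changed with -w, which also controls how far apart the two columns are. This is a sketch, assuming a pr that accepts the POSIX -m, -t and -w options. The value 40 and the scratch directory are illustrative, and the exact column width (roughly half the page width) and the use of tabs or spaces for padding may differ between pr implementations.

```shell
# Recreate the question's sample files in a scratch directory (illustrative setup).
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1
printf '%s\n' ETIAM...... SED........ MAECENAS... DONEC...... SUSPENDISSE > file1
printf '%s\n' Lorem.... Proin.... Nunc..... Quisque.. Aenean... \
              Nam...... Vivamus.. Curabitur Nullam... > file2

# -m: merge the files side by side, -t: no headers or trailers,
# -w 40: a 40-column page, so each of the two columns gets about half of it.
out=$(pr -m -t -w 40 file1 file2)
printf '%s\n' "$out"
```

A wider page (a larger -w value) leaves more room per column before truncation sets in, at the cost of a wider gap between the columns.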

Ed Morton · Accepted Answer · 2025-11-24 20:27:10Z

1

$ paste file1 file2 | column -s$'\t' -t
ETIAM......  Lorem....
SED........  Proin....
MAECENAS...  Nunc.....
DONEC......  Quisque..
SUSPENDISSE  Aenean...
             Nam......
             Vivamus..
             Curabitur
             Nullam...

or if your input might contain tabs then using any POSIX awk:

$ awk '
    NR==FNR { n=length(); wid=(n>wid?n:wid); vals[NR]=$0; next }
    { printf "%*s %s\n", wid, vals[FNR], $0 }
' file1 file2
ETIAM...... Lorem....
SED........ Proin....
MAECENAS... Nunc.....
DONEC...... Quisque..
SUSPENDISSE Aenean...
            Nam......
            Vivamus..
            Curabitur
            Nullam...

edited Nov 24 at 20:27

answered Nov 24 at 18:30

Ed Morton

Beware the awk one assumes all characters are single-width (and with some awk implementations single-byte).
– Stéphane Chazelas, Commented Nov 24 at 20:23
busybox awk doesn't support %*s in the build of busybox that comes with Debian here.
– Stéphane Chazelas, Commented Nov 24 at 20:25
@StéphaneChazelas thanks for the heads up.
– Ed Morton, Commented Nov 24 at 20:28
Doing printf "%"wid"s %s\n", vals[FNR], $0 would make it work in busybox awk.
– Stéphane Chazelas, Commented Nov 24 at 20:30
Beware it prints nothing if file1 is empty. I've personally given up on using that unreliable NR==FNR trick and use awk '!file1_processed {...; next}; ...' file1 file1_processed=1 file2.
– Stéphane Chazelas, Commented Nov 24 at 20:33
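
The suggestions from the comments above can be combined into one variant, sketched here rather than offered as the answer's own code. It builds the width into the format string instead of using %*s, and it uses the file1_processed=1 command-line assignment in place of NR==FNR. Two further changes are assumptions of this sketch, not part of the answer: the first column is left-aligned with %- so unpadded input also lines up, and leftover lines of file1 are printed in an END block for the case where file2 is the shorter file. The counters n1 and n2 and the scratch directory are illustrative names.

```shell
# Recreate the question's sample files in a scratch directory (illustrative setup).
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1
printf '%s\n' ETIAM...... SED........ MAECENAS... DONEC...... SUSPENDISSE > file1
printf '%s\n' Lorem.... Proin.... Nunc..... Quisque.. Aenean... \
              Nam...... Vivamus.. Curabitur Nullam... > file2

out=$(awk '
    # file1_processed is unset (false) while file1 is read; it is assigned on
    # the command line just before file2, so an empty file1 is handled too.
    !file1_processed {
        n = length($0)
        if (n > wid) wid = n        # widest line seen in file1
        vals[++n1] = $0             # remember the line
        next
    }
    # Left-align the file1 line in a field of width wid, then the file2 line.
    { printf "%-" wid "s %s\n", vals[FNR], $0; n2 = FNR }
    # Lines of file1 that have no partner in file2.
    END { for (i = n2 + 1; i <= n1; i++) print vals[i] }
' file1 file1_processed=1 file2)
printf '%s\n' "$out"
```

The caveat about character width still applies: length counts characters (or bytes in some implementations), not display columns.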

Stéphane Chazelas · Accepted Answer · 2025-11-24 20:28:30Z

zsh's print builtin has a -C option to print arguments formatted in Columns.

$ f1=(${(f)"$(<file1)"}) f2=(${(f)"$(<file2)"})
$ f1[$#f2]+= f2[$#f1]+=
$ print -rC2 -- "$f1[@]" "$f2[@]"
ETIAM......  Lorem....
SED........  Proin....
MAECENAS...  Nunc.....
DONEC......  Quisque..
SUSPENDISSE  Aenean...
             Nam......
             Vivamus..
             Curabitur
             Nullam...

$(<file), like in ksh, expands to the contents of file without the trailing newline characters.
The f parameter expansion flag, short for ps[\n], splits expansions (here applied to the above) on linefeeds.
With f1[$#f2]+= f2[$#f1]+= we ensure the two arrays are of the same size by appending nothing to their nth element, where n is the size of the other array; in the process this creates that element and extends the array accordingly.
print -rC2 -- "$f1[@]" "$f2[@]" prints those raw in 2 Columns.
