Join multiple sed commands in one script for processing CSV file

Question

I have a CSV file like this:

HEADER
"first, column"|"second "some random quotes" column"|"third ol' column"
FOOTER

and I am looking for a result like this:

HEADER
first, column|second "some random quotes" column|third ol' column

In other words, I want to remove the FOOTER line and the quotes at the beginning of each line, at the end of each line, and around each |.

So far this code works:

sed '/FOOTER/d' csv > csv1     # remove FOOTER
sed 's/^\"//' csv1 > csv2      # remove quote at the beginning
sed 's/\"$//' csv2 > csv3      # remove quote at the end
sed 's/\"|\"/|/g' csv3 > csv4  # remove quotes around pipe

As you can see, the problem is that it creates 4 extra files.

Here is another attempt, which aims to avoid the extra files and do the same thing in a single script. It doesn't work very well.

#!/bin/ksh

sed '/begin/, /end/ { 
        /FOOTER/d
        s/^\"//
        s/\"$//
        s/\"|\"/|/g 
}' csv > csv4

Since your fields are quoted, they can contain newlines. Your sed is not going to work with that, only with simplified CSV. Use a programming language with a library that can handle real CSV files (Python/Perl/Ruby).
– Anthon, Commented Sep 12, 2015 at 12:53

terdon · Accepted Answer · 2015-09-12 12:59:01Z

First of all, as Michael showed, you can just combine all of these into a single command:

sed '/^FOOTER/d; s/^\"//; s/\"$//; s/\"|\"/|/g' csv > csv1

I think some sed implementations can't cope with that and might need:

  sed -e '/^FOOTER/d' -e 's/^\"//' -e 's/\"$//' -e 's/\"|\"/|/g' csv > csv1
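If the intermediate files are the only concern, the same four commands can also be chained with pipes instead of being joined into one script. The following is a minimal sketch, assuming a POSIX shell with mktemp available; the scratch directory and the variable names single and piped are only for illustration. It runs both forms on the sample data from the question and prints the result if they agree.

```shell
# Scratch copy of the sample data from the question.
tmp=$(mktemp -d)
cat > "$tmp/csv" <<'EOF'
HEADER
"first, column"|"second "some random quotes" column"|"third ol' column"
FOOTER
EOF

# One sed process running four commands.
single=$(sed -e '/^FOOTER/d' -e 's/^"//' -e 's/"$//' -e 's/"|"/|/g' "$tmp/csv")

# Four sed processes connected by pipes; nothing is written to disk in between.
piped=$(sed '/^FOOTER/d' "$tmp/csv" | sed 's/^"//' | sed 's/"$//' | sed 's/"|"/|/g')

# Print the result only if both approaches agree.
[ "$single" = "$piped" ] && printf '%s\n' "$single"
rm -r "$tmp"
```

The single-process form is usually preferable because it starts one sed instead of four, but both avoid the csv1 to csv3 files.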

That said, it looks like your fields are defined by | and you just want to remove " around the entire field, leaving those that are within the field. In that case, you could do:

$ sed '/FOOTER/d; s/\(^\||\)"/\1/g; s/"\($\||\)/\1/g' csv 
HEADER
first, column|second "some random quotes" column|third ol' column

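Or, with GNU sed's extended regular expressions (-r), which need fewer backslashes (note that the \| alternation in the previous command is not part of POSIX basic regular expressions either, so both forms are safest with GNU sed):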

sed -r '/FOOTER/d; s/(^|\|)"/\1/g; s/"($|\|)/\1/g' csv
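To see where the field-boundary approach differs from matching the literal sequence "|", here is a small sketch on a hypothetical line in which only the middle field is quoted. It assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do); the variable names are made up for the example.

```shell
# Hypothetical input: only the middle field is quoted, and it contains inner quotes.
line='a|"b "x" y"|c'

# Approach 1: quotes at line start, at line end, and the exact sequence "|".
naive=$(printf '%s\n' "$line" | sed 's/^"//; s/"$//; s/"|"/|/g')

# Approach 2: any quote that touches a field boundary (line start, line end, or |).
boundary=$(printf '%s\n' "$line" | sed -E 's/(^|\|)"/\1/g; s/"($|\|)/\1/g')

printf 'naive:    %s\n' "$naive"
printf 'boundary: %s\n' "$boundary"
```

The first approach leaves the line unchanged because no quote sits at the start or end of the line and the sequence "|" never occurs, while the second strips the quotes enclosing the middle field and keeps the inner ones.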

You could also use Perl:

$ perl -F'\|' -lane 'next if /FOOTER/; s/^"|"$//g for @F; print join "|", @F' csv
HEADER
first, column|second "some random quotes" column|third ol' column

Michael Durrant · Accepted Answer · 2015-09-12 15:44:50Z

This would also work:

sed 's/^"//; s/"|"/|/g; s/""$/"/'

Example:

$ echo '"this"|" and "ths""|" and "|" this 2"|" also "this", "thi", "and th""' | 
sed 's/^"//; s/"|"/|/g; s/""$/"/'
this| and "ths"| and | this 2| also "this", "thi", "and th"

A more readable version, with one command per line (the final $d deletes the last line, i.e. the footer):

sed '
s/^"//
s/"|"/|/g
s/""$/"/
$d
'

edited Sep 12, 2015 at 15:44

answered Sep 12, 2015 at 14:00

Michael Durrant


This doesn't deal with the footer.
– terdon, Commented Sep 12, 2015 at 14:57

But that will remove the last line no matter what its contents. If there is no FOOTER, it will remove wanted data.
– terdon, Commented Sep 12, 2015 at 15:46
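
One way to address the concern in the comment above is to delete the last line only when it really is the footer. This is a sketch, not part of the original answer: the two input files and the helper name strip_quotes are hypothetical, and it assumes the footer line consists of exactly the word FOOTER.

```shell
tmp=$(mktemp -d)
# Two hypothetical inputs: one ends with FOOTER, the other does not.
printf '%s\n' 'HEADER' '"a"|"b"' 'FOOTER' > "$tmp/with_footer"
printf '%s\n' 'HEADER' '"a"|"b"'          > "$tmp/no_footer"

# strip_quotes is a hypothetical helper name.
# ${/^FOOTER$/d;} deletes the last line only when it is exactly FOOTER.
strip_quotes() {
    sed -e '${/^FOOTER$/d;}' -e 's/^"//; s/"$//; s/"|"/|/g' "$1"
}

with_footer=$(strip_quotes "$tmp/with_footer")
no_footer=$(strip_quotes "$tmp/no_footer")
printf '%s\n---\n%s\n' "$with_footer" "$no_footer"
rm -r "$tmp"
```

Both inputs produce the same two lines (HEADER and a|b), so the data line survives when there is no footer.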


Angana (edited by Paulo Tomé) · Accepted Answer · 2020-03-06 17:05:56Z

The sed command that worked for me is:

sed 's/ALA/A/g;s/CYS/C/g;s/ASP/D/g;s/GLU/E/g;s/PHE/F/g;s/GLY/G/g;s/HIS/H/g;s/HID/H/g;s/HIE/H/g;s/ILE/I/g;s/LYS/K/g;s/LEU/L/g;s/MET/M/g;s/ASN/N/g;s/PRO/P/g;s/GLN/Q/g;s/ARG/R/g;s/SER/S/g;s/THR/T/g;s/VAL/V/g;s/TRP/W/g;s/TYR/Y/g;s/MSE/X/g;s/ //g'  < old.txt > new.fasta

In my case, piping the substitutions through separate sed commands did not work; they had to be given as a single command.
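
A long list of substitutions like this can also be kept in a script file and passed to sed with -f, one command per line. The sketch below uses only a few of the mappings; the file name aa.sed and the one-line sample input are hypothetical.

```shell
tmp=$(mktemp -d)
# Hypothetical one-line input of three-letter residue codes.
printf 'ALA CYS ASP GLY\n' > "$tmp/old.txt"

# Hypothetical script file holding a few of the substitutions, one per line.
cat > "$tmp/aa.sed" <<'EOF'
s/ALA/A/g
s/CYS/C/g
s/ASP/D/g
s/GLY/G/g
s/ //g
EOF

# -f reads the sed commands from the file instead of the command line.
sed -f "$tmp/aa.sed" "$tmp/old.txt" > "$tmp/new.fasta"
fasta=$(cat "$tmp/new.fasta")
printf '%s\n' "$fasta"
rm -r "$tmp"
```

For this sample the output is ACDG. Removing the spaces last keeps the single-letter results from merging into new three-letter matches while the other substitutions run.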

edited Mar 6, 2020 at 17:05

Paulo Tomé


answered Mar 6, 2020 at 16:39

Angana

