I just wrote a few lines to grep the smallest value in each of my files. It gives me the correct result, but some lines are repeated twice. Can you help me find the bug?
What I am doing:
- Grepping all files
- Removing the header
- Sorting in scientific notation on column nine
- Taking the first line, which is the smallest after the sort, and printing it using awk
- I want the file name too, so I also print $i
Script:
#!/bin/bash
for i in `ls -v *.txt`
do
    smallestPValue=`sed 1d $i | sort -k9 -g | head -1 | awk '{print $0}'`
    echo $i $smallestPValue >> smallesttPvalueAll.txt
done
Output:
U1.text 4 rsxxx 1672175 A ADD 759 0.0751 4.918 1.074e-06
U1.txt 4 rsxxxx 1672175 A ADD 759 0.0751 4.918 1.074e-06
U2.txt 16 rsxxxx 596342 T ADD 734 -0.05458 -5.204 2.535e-07
U2.txt 16 rsxxxx 596342 T ADD 734 -0.05458 -5.204 2.535e-07
U3.txt 2 rsxxxx 12426 T ADD 722 0.06825 5.285 1.669e-07
I am getting repetitions for a few lines, while others are fine: U3 above appears only once, which is what I want. I can easily get rid of the duplicated lines with uniq or sort -u, but I am curious what is causing this.
Desired output: each line appears only once.
In ls -v *.txt, your output file smallesttPvalueAll.txt matches *.txt, so it is processed along with all the other .txt files. But there are so many things wrong with the way you're trying to do this that it's not even worth trying to fix. See my answer below for a better method.
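To make the glob problem described above concrete, here is a minimal sketch of one way to sidestep it. It is not necessarily the method from the answer referred to above. It assumes the results can go to a file whose name does not end in .txt, so that *.txt never picks it up, and that the file can be emptied on each run rather than appended to. The function name smallest_per_file and the output name used below are hypothetical.

```shell
#!/bin/bash
# Sketch: for every .txt file, print the file name followed by the row
# with the smallest value in column 9.
# Assumptions (not from the original post): the output path does not end
# in .txt, and it is truncated at the start of every run.

smallest_per_file() {
    local out=$1    # hypothetical output path, e.g. smallestPvalueAll.out
    local f line
    : > "$out"      # empty the output so repeated runs do not accumulate
    for f in *.txt; do              # glob directly instead of parsing ls
        [ -f "$f" ] || continue     # skip the literal '*.txt' if nothing matches
        # Drop the header, sort column 9 as a general number (handles 1e-06),
        # and keep the first row. LC_ALL=C makes '.' the decimal point.
        line=$(sed 1d "$f" | LC_ALL=C sort -k9,9g | head -n 1)
        printf '%s %s\n' "$f" "$line" >> "$out"
    done
}
```

It could be called as: smallest_per_file smallestPvalueAll.out. Note that a plain glob sorts names lexically, so U10.txt would come before U2.txt, unlike ls -v. If the version order matters, piping the finished output through GNU sort -V should restore it.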