bash: functions containing part of AWK code

Question

Within my bash script I have a piece of sed + AWK code that iteratively performs some operations on an input file and appends the results to another text file (both files are created by the same bash script and can be stored in different variables).

# sed removes the comment lines from the input file "${file}".xvg (file is defined at the beginning of the bash script)
sed -i '' -e '/^[#@]/d' "${file}".xvg
# AWK finds XMAX and YMAX in the input file
# and appends them as two lines to another file, batch.bfile, which is used later
awk '
  NR==1{
    max1=$1
    max2=$2
  }
  $1>max1{max1=$1}
  $2>max2{max2=$2}
  END{printf "WORLD XMAX %s\nWORLD YMAX %s\n",max1+0.5,max2+0.5}' "${file}".xvg >> "${tmp}"/batch.bfile

Is it possible to combine both actions (sed + awk) into a function (defined at the beginning of my bash script) and then use it as a one-line command within the script (in a more sophisticated case it will be applied to many files within a for loop)?

Here is an example of my version:

#!/bin/bash


#folder with batch file
home=$PWD
tmp="${home}"/tmp


## define some functions for file processing
bar_xvg_proc () {
## AWK processing of XVG file: only for bar plot;
sed -i '' -e '/^[#@]/d' ${file}
# check XMAX and YMAX for each XVG
awk '
  NR==1{
    max1=$1
    max2=$2
  }
  $1>max1{max1=$1}
  $2>max2{max2=$2}
  END{printf "WORLD XMAX %s\nWORLD YMAX %s\n",max1+0.5,max2+0.5}' ${file} >> "${tmp}"/grace2.bfile

}
###

bar_xvg_proc "${home}"/test.xvg

and here is the error from sed:

sed: -i may not be used with stdin

But if I assign my test.xvg to a variable, file="${home}"/test.xvg, before calling the function in the script, it works well. How can I use this function directly with an input file (without a specific variable assigned to the file)?

Here is my xvg file:

# Created by:
#                     :-) GROMACS - gmx cluster, 2019.3 (-:
# 
# Executable:   /usr/local/bin/../Cellar/gromacs/2019.3/bin/gmx
# Data prefix:  /usr/local/bin/../Cellar/gromacs/2019.3
# Working dir:  /Users/gleb/Desktop/DO/unity_or_separation
# Command line:
# gmx cluster is part of G R O M A C S:
#
# Good gRace! Old Maple Actually Chews Slate
#
@    title "Cluster Sizes"
@    xaxis  label "Cluster #"
@    yaxis  label "# Structures"
@TYPE xy
@g0 type bar
       1       94
       2       31
       3       24
       4       24
       5       15
       6        6
       7        6
       8        5
       9        4
      10        4
      11        3
      12        3
      13        3
      14        3
      15        2
      16        2
      17        2
      18        2
      19        1
      20        1
      21        1
      22        1
      23        1
      24        1
      25        1

It would be easy to combine the two sets of edits. The sticking point is that the sed makes a permanent (in-place) edit to the .xvg file, but the awk appends to a different file that is evidently temporary. Rearranging that would presumably have unknown side effects on future runs.
– Paul_Pedant
Commented Oct 28, 2020 at 11:46
It would appear that your sed call simply serves to remove "comment" lines that start with # or @, so that the subsequent awk call processes a file that only contains numbers. Can you post an example of one such .xvg file? Are you sure the "in-place editing" of the .xvg file via sed to remove such comments is necessary?
– AdminBee
Commented Oct 28, 2020 at 13:33
Normally both the XVG and batch files are temporary in my script. The catch is that I use another program (inside a for loop of my bash script) to create those XVG files, which I then have to pre-process (using that combination of sed + awk), and finally I use those XVGs to plot the graphs using the batch file (produced in the same loop), etc. However, since there are several repetitions of sed + AWK in my script (I apply it to several different file types), it seems logical to me to put both into functions with some local variables ...
– user3470313
Commented Oct 28, 2020 at 13:34
.. and yes, presently the script processes the files correctly: it removes the # and @ lines to keep only the data in the XVG so that AWK can measure XMAX and YMAX correctly (I then use this info for graph plotting in the loop, which requires both dimensions for each XVG). I would only like to optimize this code by introducing a function for the repeating blocks ... :-)
– user3470313
Commented Oct 28, 2020 at 13:37

Ed Morton · Accepted Answer · 2020-10-28 14:38:33Z

Just change ${file} to "$1" inside your function and it'd do what you want.
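
As a minimal sketch of why that change matters: a function receives its arguments as positional parameters ($1, $2, ...), while ${file} is only set if the calling script assigned it. With ${file} empty and unquoted, sed gets no file operand at all, hence the "may not be used with stdin" error. The two function names below are hypothetical and exist only for this demonstration.

```shell
#!/bin/bash
# Hypothetical demo functions, not part of the original script.
uses_file_var ()  { printf 'got: [%s]\n' "${file}"; }   # reads a global variable
uses_first_arg () { printf 'got: [%s]\n' "$1"; }        # reads the first argument

unset file                   # mimic a script that never assigned file
uses_file_var  test.xvg      # prints "got: []": the argument is ignored
uses_first_arg test.xvg      # prints "got: [test.xvg]"
```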

Also then consider changing this:

bar_xvg_proc () {
## AWK processing of XVG file: only for bar plot;
sed -i '' -e '/^[#@]/d' "$1"
# check XMAX and YMAX for each XVG
awk '
  NR==1{
    max1=$1
    max2=$2
  }
  $1>max1{max1=$1}
  $2>max2{max2=$2}
  END{printf "WORLD XMAX %s\nWORLD YMAX %s\n",max1+0.5,max2+0.5}' "$1" >> "${tmp}"/grace2.bfile

}

to this:

bar_xvg_proc () {
    ## AWK processing of XVG file: only for bar plot;
    # check XMAX and YMAX for each XVG
    awk '
      /^[#@]/ { next }       # skip comment/header lines (replaces the sed step)
      (++nr)==1{             # first data line: seed both maxima
        max1=$1
        max2=$2
      }
      $1>max1{max1=$1}
      $2>max2{max2=$2}
      END{printf "WORLD XMAX %s\nWORLD YMAX %s\n",max1+0.5,max2+0.5}' "${@:--}" >> "${tmp}"/grace2.bfile
}

You never need sed when you're using awk. Using "${@:--}" that way gives you a function that works whether you pass it multiple file names or pipe a stream to it: if no file name is given, it expands to -, which tells awk to read stdin.
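
To make the expansion concrete, here is a small sketch that prints the words "${@:--}" produces. It is the ordinary ${parameter:-word} default-value expansion applied to $@, with - as the default. The helper name show_args is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: show each word that "${@:--}" expands to.
show_args () {
    printf '<%s>' "${@:--}"   # one <...> per resulting word
    printf '\n'
}

show_args                  # no arguments: falls back to a single "-"   -> <->
show_args a.xvg b.xvg      # arguments given: passed through unchanged  -> <a.xvg><b.xvg>
```

Because awk treats a - operand as standard input, the same function body can serve both `bar_xvg_proc file1 file2` and `some_command | bar_xvg_proc`.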

Idk if you should really be using >> instead of > at the end of that, and you might want to do the output redirection outside of the function.
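
One way the second suggestion could look, as a sketch rather than a drop-in replacement: the function only writes to stdout and the caller decides where the output goes. The scratch directory and the three-row sample (a subset of the data in the question) are assumptions made so the example runs on its own.

```shell
#!/bin/bash
# Sketch: the function writes to stdout; the caller picks the destination.
bar_xvg_proc () {
    awk '
      /^[#@]/ { next }                    # skip header lines
      (++nr)==1 { max1=$1; max2=$2 }      # seed maxima from the first data line
      $1>max1 { max1=$1 }
      $2>max2 { max2=$2 }
      END { printf "WORLD XMAX %s\nWORLD YMAX %s\n", max1+0.5, max2+0.5 }
    ' "${@:--}"
}

tmp=$(mktemp -d)                                                   # assumed scratch directory
printf '# header\n@TYPE xy\n1 94\n2 31\n25 1\n' > "$tmp/test.xvg"  # small sample input

bar_xvg_proc "$tmp/test.xvg" >> "$tmp/grace2.bfile"   # append, as in the question
cat "$tmp/grace2.bfile"
```

Switching between >> and > (or piping the result elsewhere) then needs no change to the function itself.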

edited Oct 28, 2020 at 14:38

answered Oct 28, 2020 at 14:27


Thank you very much! Your script works very well, except that I did not understand what "${@:--}" is and how it differs from "$1" in the function (both work correctly in my case, and the function correctly accepts the input files). And yes, I use >> with AWK to append two lines to "${tmp}"/grace2.bfile each time.
– user3470313
Commented Oct 28, 2020 at 14:52
Create these 2 functions: foo() { awk '{print FILENAME, $0}' "${@:--}"; } and bar() { awk '{print FILENAME, $0}' "$1"; }. Now create 2 input files to test with seq 2 > file1; seq 3 > file2. Now try foo file1 file2 and echo 7 | foo and then do the same for bar. See the difference? Now re-read the paragraph in my answer where I explain that difference and then let me know if you still have a question.
– Ed Morton
Commented Oct 28, 2020 at 15:12
Alright, thank you again! Cheers,
– user3470313
Commented Oct 28, 2020 at 15:13
