Did you mean to do something like this? It's the only way I can think of to make sense of your script.
awk -v OFS=$'\t' '
FNR == 1 { $5 = "sdev" ; print }
FNR > 1  { a = $4     # field 4 is the "avg"
           n = NF-1   # exclude the "avg" field from the ss calculations.
           ss = 0     # reset the sum of squares for every line
           for (i=1; i <= n; i++) { ss += ($i - a)^2 }
           $5 = sqrt(ss/n)
           print
}' inputfile
Note: $i on the for line refers NOT to the value of i, but to the input field numbered i - i.e. it loops through $1, $2, and $3. This may not be obvious to shell or perl users, where (scalar) variables are normally prefixed by $.
NF is the number of fields on a line, and FNR is the record (line) number within the current input file, so this awk script supports multiple input files, each with their own header line. If there's only ever going to be one input file at a time, you could use NR instead of FNR.
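To make the difference between i and $i, and between NR and FNR, easier to see, here is a minimal sketch. The two input files (one.txt and two.txt) and their contents are made up purely for this demonstration; they are not part of the original question.

```shell
# Two tiny hypothetical input files, created only for this demonstration.
tmpdir=$(mktemp -d)
printf 'a b c\n' > "$tmpdir/one.txt"
printf 'x y\n'   > "$tmpdir/two.txt"

# For each line print FNR (line number within the current file),
# NR (line number across all files) and NF (field count),
# followed by the loop counter i and the field $i it selects.
result=$(awk '{
    printf "FNR=%d NR=%d NF=%d:", FNR, NR, NF
    for (i = 1; i <= NF; i++) printf " i=%d $i=%s", i, $i
    print ""
}' "$tmpdir/one.txt" "$tmpdir/two.txt")

printf '%s\n' "$result"
rm -r "$tmpdir"
```

FNR goes back to 1 at the start of the second file while NR keeps counting, which is why the FNR == 1 test matches the header line of every input file.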
Sample output:
1 2 3 avg sdev
23.3107 20.0372 21.7236 21.6905 1.33661
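The sdev value is a population standard deviation, because the script divides ss by n rather than n-1. If you want to check a row by hand, something like the following sketch should do. It recomputes the mean from the three values instead of trusting the avg column, and prints the sample standard deviation (dividing by n-1) next to it for comparison. The variable names sum and mean are just illustrative.

```shell
# Recompute mean and standard deviation for the three sample values.
result=$(printf '23.3107 20.0372 21.7236\n' | awk '{
    sum = 0; ss = 0
    for (i = 1; i <= NF; i++) sum += $i             # total of all fields
    mean = sum / NF
    for (i = 1; i <= NF; i++) ss += ($i - mean)^2   # sum of squared deviations
    printf "mean=%.4f population_sd=%.5f sample_sd=%.5f\n",
           mean, sqrt(ss / NF), sqrt(ss / (NF - 1))
}')

printf '%s\n' "$result"
```

The population_sd figure should agree with the sdev column in the sample output above. If you actually want the sample standard deviation, divide by n-1 in the main script instead.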
Here's another version which works with any number of fields per line. It assumes that the last field of a line contains the average of all the previous fields on that line.
$NF refers to the value of the last field (i.e. the 'avg') and $new refers to field number (last field + 1), i.e. assigning a value to it adds a new field to the end of the line.
awk -v OFS=$'\t' '
FNR == 1 { new = NF+1   # number of new field to add
           $new = "sdev"
           print
}
FNR > 1  { a = $NF      # last field is the "avg"
           n = NF-1     # exclude the "avg" field from the ss calculations.
           ss = 0       # reset the sum of squares for every line
           for (i=1; i <= n; i++) { ss += ($i - a)^2 }
           $new = sqrt(ss/n)
           print
}' inputfile
Sample output with 5 values plus an average on each input line:
1 2 3 4 5 avg sdev
23.3107 20.0372 21.7236 20.5328 21.2016 21.3611 1.13107
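Assigning to a field number past NF, as the script does with $new, is what appends the extra column. A small sketch of just that behaviour follows. The input line and the "|" separator are arbitrary choices to make the result easy to read; they are not taken from the question.

```shell
# Show that assigning to field NF+1 appends a field and rebuilds the line.
result=$(printf 'a b c\n' | awk -v OFS='|' '{
    new = NF + 1      # index just past the current last field
    $new = "added"    # creates field 4; awk rebuilds $0 using OFS
    print NF, $0      # NF has grown to include the new field
}')

printf '%s\n' "$result"
# prints: 4|a|b|c|added
```

Note that modifying any field makes awk rejoin the whole line with OFS, which is why the scripts above set OFS to a tab.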
file command to get standard deviations, which is unlikely. a=$5 when there are only 4 fields - and, worse, doing it in the END block where there aren't any input fields anyway (because you've already processed them in the main block). Why are you trying to process the array of input lines in mathematical functions? Those array elements don't contain single floating-point numbers, they are strings containing entire lines (with all 4 fields). Finally, n isn't defined anywhere, so ss/n is always going to be a division by 0 error.
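To illustrate the last point: a variable that is never assigned counts as 0 in numeric context (and as the empty string in string context), so dividing by it fails. The sketch below only demonstrates that behaviour, with a guard around the division. The value given to ss is arbitrary.

```shell
# Demonstrate that an unassigned awk variable behaves as both 0 and "".
result=$(awk 'BEGIN {
    # n is never assigned anywhere in this program.
    if (n == 0 && n == "") print "n is uninitialized"
    ss = 10
    if (n > 0) print ss / n                      # only divide when n is usable
    else       print "skipping division: n is 0"
}')

printf '%s\n' "$result"
```

Without a guard like this, awk would abort on the ss / n expression with a division by zero error.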