Total a Column with addup

Some programs output information in columns. The addup script totals the numbers in one column. It reads from files or standard input. For example, the lastcomm command shows CPU time used in column 4, like this:


    % lastcomm tcomm
    sleep   tcomm   __   0.08 secs Thu Mar 27 10:23
    date    tcomm   __   0.08 secs Thu Mar 27 10:23
    tail    tcomm   __   0.09 secs Thu Mar 27 10:23
    pwho    tcomm   __   0.30 secs Thu Mar 27 10:23
    % lastcomm tcomm | addup 4
    0.5500

When given several files, grep -c prints each filename, a colon (:), and the number of matching lines in that file. To total the matches, pipe grep's output through a little sed command that strips off the filename and colon, then have addup sum what is left (the "first column"):

    % grep -c CAUTION *.txt
    abar.txt:0
    applic.txt:3
    badprob.txt:235
       ...
    % grep -c CAUTION *.txt | sed 's/.*://' | addup 1
    317
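The pipeline above can be tried without real data. The following is only a sketch: the temporary directory, the file names, and the file contents are made up for the demonstration, a plain awk one-liner stands in for "addup 1" in case addup is not installed, and it assumes a system that provides mktemp -d.

```shell
#!/bin/sh
# Sketch: total the per-file counts printed by grep -c.
# The directory, file names, and contents are hypothetical sample data.
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' 0          # remove the sample files when the shell exits

printf 'CAUTION: hot\nok\nCAUTION: sharp\n' > "$dir/a.txt"
printf 'nothing here\n'                     > "$dir/b.txt"
printf 'CAUTION: wet\n'                     > "$dir/c.txt"

# With more than one file, grep -c prints "filename:count" per file.
# sed deletes everything through the last colon, leaving only the count;
# the awk command stands in for "addup 1" and sums that single column.
total=$(grep -c CAUTION "$dir"/*.txt | sed 's/.*://' |
        awk '{sum += $1} END {print sum}')

echo "$total"
```

Because the sed pattern .*: is greedy, it removes text up to the last colon on the line, so a colon inside a pathname does not disturb the count.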

Here's the script:

    case "$1" in
    [1-9]*) colnum="$1"; shift;;
    *)      echo "Usage: `basename $0` colnum [files]" 1>&2; exit 1;;
    esac

    # Use integer output, but switch to %.4f format if "." in input.
    awk '{sum += $col} END {print sum}' col=$colnum OFMT="%.4f" ${1+"$@"}

The ${1+"$@"} holds filenames (if any) from the command line and works around a shell quoting problem: in some older Bourne shells, a plain "$@" with no arguments expands to a single empty argument, which awk would try to open as a file. The column number is passed to awk as a variable assignment (col=$colnum) on its command line; inside the awk program, $col then refers to that field. When the total is a whole number, awk prints it as an integer, without a decimal point. When the total has a fractional part, as it usually will if the input contains a ".", awk formats it with OFMT; the setting of %.4f makes awk print such a result with four digits after the decimal point. (The default OFMT setting, %.6g, prints large numbers in e-notation. If you want that, delete the OFMT="%.4f".)
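To see the two output formats side by side, the awk command can be wrapped in a small shell function and fed a few numbers. This is a sketch for experimenting, not part of the script itself: the function name sum_col and the sample inputs are invented here.

```shell
#!/bin/sh
# Sketch: the awk command from addup wrapped in a hypothetical function,
# sum_col, so the effect of OFMT is easy to try from the command line.
sum_col() {
    col=$1; shift
    # Any remaining arguments are filenames; with none, awk reads stdin.
    awk '{sum += $col} END {print sum}' col="$col" OFMT="%.4f" ${1+"$@"}
}

# Whole-number total: awk prints it as an integer and ignores OFMT.
printf '1\n2\n3\n' | sum_col 1        # prints 6

# Fractional total: print formats the value using OFMT (%.4f).
printf '0.25\n0.5\n' | sum_col 1      # prints 0.7500

# Decimal points in the input, but a whole-number total: still an integer.
printf 'a 1.5\nb 2.5\n' | sum_col 2   # prints 4
```

The last case shows that the choice of format depends on the value of the total rather than on how the input was written.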

- JP