stream statistics utility
stats.awk is a simple one-pass streaming statistics utility implemented in AWK.
It reads numeric values from standard input or files and computes
basic descriptive statistics (mean, variance, standard deviation, min/max)
and, optionally, a one-pass linear regression y = a + b x.
It is intended for use in Unix pipelines on FreeBSD and Linux systems.
All computations are performed in a single pass over the input stream.
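As an illustration of how a mean and an unbiased variance can be obtained in a single pass, here is a minimal sketch using Welford's update. It is not taken from stats.awk, whose internal method is not described here. The function name `onepass_stats` and the output layout are hypothetical.

```sh
# Hypothetical helper, NOT part of stats.awk: one-pass mean and
# unbiased variance of the first column, using Welford's update.
onepass_stats() {
    awk '
    {
        n++
        delta = $1 - mean           # deviation from the previous mean
        mean += delta / n           # updated running mean
        m2 += delta * ($1 - mean)   # running sum of squared deviations
    }
    END {
        if (n < 1) exit 1           # no input records
        var = (n > 1) ? m2 / (n - 1) : 0   # unbiased (n - 1) variance
        printf "count=%d mean=%.6g var=%.6g std=%.6g\n", n, mean, var, sqrt(var)
    }' "$@"
}

seq 1 100 | onepass_stats
# prints: count=100 mean=50.5 var=841.667 std=29.0115
```

Only three running values (n, mean, m2) are kept, so memory use does not grow with the length of the stream.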
When processing single-column input with no xcol or ycol specified,
stats.awk automatically performs linear regression using the record index as
x and the column value as y.
Descriptive statistics are always computed from the y values.
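The single-column case can be sketched as an ordinary least-squares fit accumulated from running sums. This is an illustrative sketch, not the stats.awk source. The function name `index_regression` is hypothetical, and the record index is assumed to start at 1. The sample output in this document is consistent with that assumption: for 1400 records the mean index is 700.5, and a + b × 700.5 reproduces the reported mean.

```sh
# Hypothetical helper, NOT part of stats.awk: one-pass least-squares fit
# y = a + b x, with the record index (assumed 1-based) as x and the
# first column as y.
index_regression() {
    awk '
    {
        n++
        x = n; y = $1               # record index as x, column value as y
        sx += x;      sy += y       # running sums of x and y
        sxx += x * x; sxy += x * y  # running sums of x^2 and x*y
    }
    END {
        d = n * sxx - sx * sx
        if (d == 0) exit 1           # fewer than two points: slope undefined
        b = (n * sxy - sx * sy) / d  # slope
        a = (sy - b * sx) / n        # intercept
        printf "a=%.6g b=%.6g\n", a, b
    }' "$@"
}

printf '3\n5\n7\n9\n' | index_regression
# prints: a=1 b=2
```

The raw-sum formulation is the simplest to read, but it can lose precision on long streams with large values. A production implementation might use a mean-centered update instead.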
AWK built-in functions: sqrt(), sprintf(), tolower()
make install # Installs script and man page to /usr/local
make uninstall # Removes installed files
You can override the prefix:
make PREFIX=$HOME/.local install
chmod +x stats.awk
cp stats.awk /usr/local/bin/
cp stats.awk.1 /usr/local/man/man1/
awk -f stats.awk [options] [file ...]
Example:
Count: 1400
Mean: 1.1883143
Sum: 1663.6401
Unbiased Variance: 0.002564187
Unbiased StdDev: 0.050637802
Peak (Max): 1.353943
2nd Peak: 1.353943
Neg Peak (Min): 0.996242
2nd Neg Peak: 1.000321
Regression a: 1.2674196
Regression b: -0.00011292684
JSON FORMAT
Example:
{
"count": 1400,
"mean": 1.1883143,
"sum": 1663.6401,
"var": 0.002564187,
"std": 0.050637802,
"max": 1.353943,
"max2": 1.353943,
"min": 0.996242,
"min2": 1.000321,
"regression": {
"a": 1.2674196,
"b": -0.00011292684
}
}
seq 1 100 | stats.awk
CSV output
stats.awk -v out=csv data.txt
12-digit precision in scientific notation
stats.awk -v prec=12 -v fmt=sci data.txt
Linear regression on 1st and 2nd columns
stats.awk -v xcol=1 -v ycol=2 data.txt
JSON output for downstream processing
stats.awk -v out=json data.txt | jq .
Download stats.awk-1.20.tar.gz: the stats.awk script and manual page.
The archive also contains 'stats.awk.sha256' and 'stats.awk.md5', which hold the SHA-256 and MD5 hash values of stats.awk, respectively.
SHA256 (stats.awk) = 3dc37347c677aac25989dd7ee2f1af3fdade71a5c21a5fbe68ff3d3ad283d576
MD5 (stats.awk) = 110a558e7ca41edda3e0052bb98ae6ea
Users can verify the distributed stats.awk using:
sha256 stats.awk
md5 stats.awk
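The sha256 and md5 commands above are the FreeBSD tools. On Linux the GNU coreutils equivalents are sha256sum and md5sum. The following is a minimal sketch of a portable check. The helper names `sha256_of` and `verify_sha256` are hypothetical and not part of the distribution.

```sh
# Hypothetical helpers, NOT part of the stats.awk distribution.

# Print the SHA-256 digest of a file with whichever tool is installed:
# sha256 (FreeBSD) or sha256sum (GNU coreutils on Linux).
sha256_of() {
    if command -v sha256 >/dev/null 2>&1; then
        sha256 -q "$1"                        # -q prints the digest only
    elif command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{ print $1 }'   # keep the digest, drop the file name
    else
        echo "no SHA-256 tool found" >&2
        return 1
    fi
}

# Succeed only if the file's digest equals the expected value.
verify_sha256() {
    [ "$(sha256_of "$1")" = "$2" ]
}

# Usage, with the published value from this page:
# verify_sha256 stats.awk 3dc37347c677aac25989dd7ee2f1af3fdade71a5c21a5fbe68ff3d3ad283d576 && echo OK
```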



© 2000 Takayuki HOSODA.