In the basic Auto-pilot workflow, Auto-pilot produces two files for each
experiment (as defined by a TEST directive in your .ap file). The first file,
which has the suffix .log, is a log of stdout and stderr. No processing is done
on this file; it is there so you can look at what happened (for example, if
something broke or you can't explain your results). The second file is more
interesting: it is the results file, which has the suffix .res. The results
file contains information that the benchmarker deems worthy of automatic
extraction (the default Auto-pilot scripts record time and, optionally, several
other quantities; see Included Script Plugins).
The results files are made up of blocks, which start with [blockname], and then
contain simple text (for example, the system snapshot contains the output of
various commands, without any special formatting). Each command that is
measured with the ap_measure function creates a measurement block, which looks
like the following:
[measurement]
thread = 2
epoch = 2
command = postmark /tmp/postmark_config-9868
user = 0.200000
sys = 1.170000
elapsed = 5.470682
status = 0
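To make the block structure concrete, here is a minimal Python sketch that splits results-file text into blocks and turns a measurement block into a dictionary. It is not part of Auto-pilot or Getstats. The function names parse_blocks and measurement_fields are hypothetical, and the sketch assumes that each [blockname] header and each "name = value" field sits on its own line. Free-form blocks such as the system snapshot could contain lines that look like headers, which this sketch does not handle.

```python
import re

def parse_blocks(text):
    """Split results-file text into (blockname, lines) pairs.

    Assumption: a block header is a line holding only "[blockname]".
    """
    blocks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"\[(\w+)\]", line)
        if header:
            current = (header.group(1), [])
            blocks.append(current)
        elif current is not None and line:
            current[1].append(line)
    return blocks

def measurement_fields(lines):
    """Turn the "name = value" lines of a measurement block into a dict."""
    fields = {}
    for line in lines:
        key, sep, value = line.partition(" = ")
        if sep:  # skip lines that are not "name = value"
            fields[key.strip()] = value.strip()
    return fields

# The measurement block shown above, one field per line (assumed layout).
SAMPLE = """\
[measurement]
thread = 2
epoch = 2
command = postmark /tmp/postmark_config-9868
user = 0.200000
sys = 1.170000
elapsed = 5.470682
status = 0
"""

blocks = parse_blocks(SAMPLE)
name, lines = blocks[0]
fields = measurement_fields(lines)

# CPU time charged to the command is user time plus system time.
cpu_seconds = float(fields["user"]) + float(fields["sys"])
print(name, fields["command"], cpu_seconds)
```

All values are kept as strings until they are needed as numbers, so fields added by a measure hook pass through unchanged.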
Using a measure hook, arbitrary fields can be added to this measurement. The Auto-pilot distribution includes two measure hooks by default. The first keeps track of the number of SCSI commands queued on Adaptec SCSI cards. The second determines the amount of CPU time used by all processes in the system, which can be compared with the amount of CPU time your benchmark used. These two hooks have proven useful for investigating possible anomalies. They are located in @pkgdatadir@/commonsettings.d and can be used as samples for creating your own measurement hooks. By default, @pkgdatadir@ is /usr/local/share/auto-pilot.
After you have these results files, you can pass them through the Getstats program. Getstats is an automated and powerful way to transform the results files into nicely formatted tables, and to compare two different results files.
If you just want to know how to use Getstats and not how it works, then read only this chapter. If you want to understand the internals and perform more complex transformations, then you should also read Getstats Internals.
Getstats starts off by processing its command line. The command line consists of options and transformations, followed by a list of files to read.
Getstats takes each transformation that is specified on the command line and pushes it onto a stack (@TRANSFORMS) for later use.
Each file that is specified on the command line is parsed into a two-dimensional array. You can specify a CSV file, an Auto-pilot results file, or a sequence of GNU time output. Getstats automatically determines the file type and parses the file accordingly. Getstats Parsers describes how to add your own parser. The two-dimensional array is a relation that Getstats then manipulates. The relation consists of labels (or names) for each field, and then rows (or tuples) with a value for each field.
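The idea of a relation (one list of labels plus rows of values) can be illustrated with a short Python sketch that reads CSV text. This only illustrates the data shape and is not Getstats's own parser. The names relation_from_csv and column are hypothetical, the sketch assumes the first CSV row holds the labels, and the second data row is made-up sample data.

```python
import csv
import io

def relation_from_csv(text):
    """Parse CSV text into a relation: labels plus rows of values.

    Assumption: the first CSV row holds the field labels.
    """
    reader = csv.reader(io.StringIO(text))
    labels = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    return {"labels": labels, "rows": rows}

def column(relation, label):
    """Return every value of one field, in row order."""
    index = relation["labels"].index(label)
    return [row[index] for row in relation["rows"]]

# Hypothetical sample input; each row is one tuple of the relation.
CSV_TEXT = "epoch,elapsed,status\n1,5.470682,0\n2,5.512300,0\n"

relation = relation_from_csv(CSV_TEXT)
print(relation["labels"])
print(column(relation, "elapsed"))
```

Storing the labels separately from the rows means a transformation can rename, drop, or add fields without touching the values of the other fields.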
Each transformation on the @TRANSFORMS stack is applied to each of the relations in turn. If two transformations, A and B, are specified and there are two files, R and S, then A is applied to R, A is applied to S, B is applied to R, and finally B is applied to S.
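The application order described above amounts to an outer loop over transformations and an inner loop over relations. The Python sketch below records that order. The two transformations (drop_failed and keep_elapsed) and the sample values are hypothetical stand-ins, not transformations or data that ship with Getstats.

```python
def drop_failed(relation):
    """Hypothetical transformation A: keep rows whose status is "0"."""
    labels, rows = relation
    i = labels.index("status")
    return labels, [row for row in rows if row[i] == "0"]

def keep_elapsed(relation):
    """Hypothetical transformation B: keep only the elapsed field."""
    labels, rows = relation
    i = labels.index("elapsed")
    return ["elapsed"], [[row[i]] for row in rows]

def apply_all(transforms, relations):
    """Apply each transformation to every relation before moving on to
    the next transformation; return the order of application."""
    order = []
    for t_name, transform in transforms:
        for r_name in relations:
            relations[r_name] = transform(relations[r_name])
            order.append((t_name, r_name))
    return order

# Two relations standing in for files R and S (made-up values).
relations = {
    "R": (["elapsed", "status"], [["5.47", "0"], ["9.99", "1"]]),
    "S": (["elapsed", "status"], [["5.51", "0"]]),
}
transforms = [("A", drop_failed), ("B", keep_elapsed)]

order = apply_all(transforms, relations)
print(order)
```

Because B runs only after A has been applied to every relation, B always sees relations that A has already filtered.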
This section describes how to produce tabular reports, how to convert a results file into a CSV file, and how to do simple hypothesis testing with Getstats.