
2.3.1 Descriptive statistics

This section describes how to compute simple statistics for each alignment block. For computational efficiency, several statistics can be computed simultaneously on each block. The results are written to a text file, with one line per block and one column per statistic. In addition, the coordinates of each block are reported relative to a specified reference species. The user chooses which statistics to compute. Some statistics output several results, each of which appears in its own column of the output file.

Descriptive statistics are computed through the SequenceStatistics filter, which takes the following arguments:

Synopsis:

maf.filter=                                 \
    [...],                                  \
    SequenceStatistics(                     \
        statistics=(                        \
            BlockLength,                    \
            AlnScore,                       \
            BlockCounts),                   \
        ref_species=species1,               \
        file=data.statistics.csv,           \
        compression=none),                  \
    [...]

Arguments:

statistics={list of statistics functions}

See below for the list of possible functions and their detailed description.

ref_species={string}

The species used to report block coordinates in the output file. For blocks where the reference species is missing, NA is reported instead.

file={path}

The path of the output file.

compression={none|gzip|zip|bzip2}

Compression format for the output file.
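Because the output has one line per block and one column per statistic, it can be loaded into a script for downstream analysis. The following Python function is only a minimal sketch, not part of MafFilter: the name read_block_statistics is hypothetical, and it assumes a header line, tab-separated columns (adjust the delimiter argument if your file differs), and the literal string NA for missing coordinates. The zip format is left out for brevity.

```python
import bz2
import csv
import gzip

# Map values of the filter's `compression` argument to Python file openers.
# "zip" is deliberately not handled in this sketch.
_OPENERS = {"none": open, "gzip": gzip.open, "bzip2": bz2.open}


def read_block_statistics(path, compression="none", delimiter="\t"):
    """Return one dict per alignment block, keyed by the header column names.

    Assumptions (not confirmed by the manual): the first line is a header,
    columns are separated by `delimiter`, and missing values are written
    as the literal string "NA".
    """
    opener = _OPENERS[compression]
    with opener(path, mode="rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        # Values are kept as strings; "NA" is converted to None.
        return [
            {name: (None if value == "NA" else value) for name, value in row.items()}
            for row in reader
        ]
```

With the synopsis above, the call would be read_block_statistics("data.statistics.csv"). Blocks lacking the reference species then carry None in their coordinate columns and can be filtered out before further processing.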

The statistics to compute take the form of functions (just like filters themselves), which can potentially take arguments. Here is the list of currently available statistical functions: