Next: , Previous: , Up: Descriptive   [Contents] Site frequency spectrum

The SiteFrequencySpectrum computes the site frequency spectrum for each block. Only positions in the alignment with only two states are considered. The proportions of bi-allelic sites are then computed by bins. Lets consider the following example with 7 sequences:


It contains one site with 4/7 (eq 3/7), one site with 2/7 (eq 5/7), one site with 1/7 (eq 6/7) and one site 0/7 (eq 7/7). With seven sequences, there are actually 4 possibles frequencies: 1/7, 2/7, 3/7 and 4/7 (plus the 0/7 frequency, corresponding to a site with no-mutation). It is possible to compute all these frequencies individually by settingthe bounds parameter to

SiteFrequencySpectrum(bounds=(-0.5,0.5,1.5,2.5,3.5,4.5), ...)

which will output the number of site with 0, 1, 2, 3 or 4 minor states. Sites with more than 2 states are always counted separately, as well as sites containing unresolved characters. If one want to count only constant sites for instance, one can simply type

SiteFrequencySpectrum(bounds=(-0.5,0.5), ...)

The remaining sites will be pulled in a column called “Ignored”.


maf.filter=                                 \
    SequenceStatistics(                     \
        statistics=(\                       \
            SiteFrequencySpectrum(          \
                bounds=(-0.5, 0.5, 1.5),    \
                ingroup=(pop1, pop2, pop3), \
                outgroup=species2,          \
            [...]),                         \
        ref_species=pop1,                   \
        file=data.statistics.csv),          \


bounds={list of double}

The bounds delimiting the bins of the spectrum to be computed.


A list of species forming the ingroup and on which the statistics should be calculated.


A species name saying which sequence should be used to determine the ancestral state. If non-empty, the unfolded frequency spectrum will be computed. If empty, the folded frequency spectrum will be returned (1/5 and 4/5 frequences will be pulled, so that the maximum frequency will be 1/2).