2.3.1.13 Fit a substitution model and estimate parameters

ModelFit fits a substitution model to each input block, given a phylogenetic tree. All homogeneous nucleotide models can be used, with or without rate variation across sites.

Synopsis:

maf.filter=                                 \
    [...],                                  \
    SequenceStatistics(                     \
        statistics=(                        \
            [...],                          \
            ModelFit(                       \
                model=HKY85(                \
                     kappa=1,               \
                     theta=0.5,             \
                     theta1=0.5,            \
                     theta2=0.5),           \
                rate_distribution=Gamma(    \
                     n=4,                   \
                     alpha=0.5),            \
                tree=BioNJ,                 \
                parameters_output=(         \
                     HKY85.theta,           \
                     HKY85.theta1,          \
                     HKY85.theta2,          \
                     HKY85.kappa),          \
                fixed_parameters=(),        \
                reestimate_brlen=yes,       \
                max_freq_gaps=0.3,          \
                gaps_as_unresolved=yes),    \
            [...]),                         \
        ref_species=species1,               \
        file=data.statistics.csv),          \
    [...]

Arguments:

model={string}

Substitution model to use. See the Bio++ Program Suite manual for an extensive description of available models. All nucleotide models can be used.

rate_distribution={string}

The distribution for rates across sites. See the Bio++ Program Suite manual for all available distributions.
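
As an illustration, a fit without rate variation across sites could be requested with a constant distribution instead of the Gamma distribution shown in the synopsis. This sketch assumes that the Constant() keyword of the Bio++ Program Suite is accepted here; check that manual for the exact syntax.

            ModelFit(                       \
                [...],                      \
                rate_distribution=Constant(), \
                [...])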

root_freq={None|Full|GC}

Allow root frequencies to differ from the equilibrium frequencies of the model (non-stationary model). Root frequencies can be fully parametrized, or parametrized with GC content only.

tree={string|none}

The property name under which trees are stored for each block. If set to “none”, an input tree file must be given with tree.file.

tree.file={path}[tree=none]

Path to the tree file, used when no tree property is set.

tree.format={Newick|Nhx}[tree=none]

Format of the tree file, used when no tree property is set.
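
For example, when no tree property has been computed for the blocks, the same tree can be read from a file for every block. In this sketch the file name species.dnd is a hypothetical placeholder, and the other arguments are elided:

            ModelFit(                       \
                [...],                      \
                tree=none,                  \
                tree.file=species.dnd,      \
                tree.format=Newick,         \
                [...])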

parameters_output={list}

A list of parameter names to output as statistics.

fixed_parameters={list}

A list of parameters that should not be optimized, but kept fixed at their initial values.
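
For example, to keep kappa at its initial value and estimate only the frequency parameters, something like the following could be used. This sketch assumes that fixed_parameters takes the same parameter names as parameters_output in the synopsis; the initial value kappa=2 is only illustrative.

            ModelFit(                       \
                model=HKY85(                \
                     kappa=2,               \
                     theta=0.5,             \
                     theta1=0.5,            \
                     theta2=0.5),           \
                [...],                      \
                parameters_output=(         \
                     HKY85.theta,           \
                     HKY85.theta1,          \
                     HKY85.theta2),         \
                fixed_parameters=(          \
                     HKY85.kappa),          \
                [...])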

reestimate_brlen={boolean}

Whether the branch lengths of the tree should be re-estimated for each block, together with the other model parameters.

max_freq_gaps={float}

The maximum proportion of gaps for a site to be included in the analysis.

gaps_as_unresolved={yes/no}

Whether remaining gaps should be converted to ’N’ before likelihood computation. This should be ’yes’ unless you specify a substitution model that explicitly allows for gaps.

global_clock={yes/no}

Assume a global clock for branch lengths.

reparametrize={yes/no}

Transform parameters to remove constraints (can improve optimization, but is usually slower).