Description

Infers a species tree from unrooted gene trees using the MDC (minimizing deep coalescences) criterion. The input gene trees must be unrooted and specified in the Rich Newick format. The output trees are also written in the Rich Newick format.

Usage

Infer_ST_MDC_UR (gene_tree_ident1 [, gene_tree_ident2...]) [-e proportion] [-x] [-b threshold] [-a taxa map] [-ur] [-t time] [result output file]

gene_tree_ident1 [, gene_tree_ident2...]

Comma delimited list of gene tree identifiers. See details.

mandatory

-e proportion

Return the optimal tree together with the sub-optimal trees that fall within the given proportion of its number of extra lineages.

optional

-x

Use all possible clusters, not only those induced by the gene trees, when inferring the species tree.

optional

-b threshold

Specifies bootstrap threshold. Edges in the gene trees that have support lower than threshold will be contracted.

optional

-a taxa map

Association between species tree taxa and gene tree taxa, used when multiple alleles are sampled per species.

optional

-ur

Allow non-binary species tree generation.

optional

-t time

Limit search time to time minutes.

optional

result output file

Optional file destination for command output.

optional

By default, the method returns only the optimal tree. The -e option additionally returns a set of sub-optimal trees: if the optimal tree has n extra lineages, every sub-optimal tree with fewer than (1+proportion/100)*n extra lineages is returned along with the optimal tree.
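The cutoff for -e can be illustrated with a short Python sketch. It is not PhyloNet code: the function name, the tree labels, and the extra-lineage counts are hypothetical, and only the formula (1+proportion/100)*n comes from the description above.

```python
from typing import Dict, List


def select_trees(extra_lineages: Dict[str, int], proportion: float) -> List[str]:
    """Pick the optimal tree(s) plus sub-optimal trees below the -e cutoff.

    extra_lineages maps a (hypothetical) tree label to its number of extra
    lineages; proportion is the value that would be passed to -e.
    """
    n = min(extra_lineages.values())  # extra lineages of the optimal tree
    selected = []
    for name, xl in extra_lineages.items():
        # xl < (1 + proportion/100) * n, multiplied through by 100 so that
        # integer inputs are compared without floating-point rounding.
        if xl == n or xl * 100 < (100 + proportion) * n:
            selected.append(name)
    return sorted(selected)


# Hypothetical scores: the optimal tree T1 has n = 10 extra lineages.
scores = {"T1": 10, "T2": 11, "T3": 12, "T4": 15}
print(select_trees(scores, 20))  # cutoff is (1 + 20/100) * 10 = 12
```

With a proportion of 20 the cutoff is 12, so T2 (11 extra lineages) is kept while T3 (exactly 12) is not, because the comparison is strict.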

By default, the method infers the species tree using only the clusters induced by the gene trees. The -x option makes it use all possible clusters instead.

If the input gene trees have bootstrap values, a threshold can be set with the -b option. Edges whose support is lower than the threshold are contracted.
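As a rough illustration of what contracting low-support edges means, the sketch below collapses internal edges in a toy tree. The nested-tuple representation and the function name are assumptions made for this example; it does not parse Rich Newick and is not how PhyloNet implements -b.

```python
def contract_low_support(node, threshold):
    """Return a copy of the tree with weakly supported internal edges collapsed.

    A leaf is a string. An internal node is a pair (children, support), where
    support is the bootstrap value of the edge above the node (None at the root).
    Assumption: only internal edges are candidates for contraction.
    """
    if isinstance(node, str):
        return node
    children, support = node
    new_children = []
    for child in children:
        child = contract_low_support(child, threshold)
        is_internal = not isinstance(child, str)
        if is_internal and child[1] is not None and child[1] < threshold:
            # Support is lower than the threshold: remove the edge by
            # attaching the child's own children directly to this node.
            new_children.extend(child[0])
        else:
            new_children.append(child)
    return (new_children, support)


# Toy tree (((a,b)0.9,c)0.3,d): the edge with support 0.3 is contracted.
tree = ([([(["a", "b"], 0.9), "c"], 0.3), "d"], None)
print(contract_low_support(tree, 0.5))
```

Contracting an edge leaves a polytomy, which is why a gene tree that was binary can become non-binary after -b is applied.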

By default, the method always returns a binary species tree. The -ur option allows a non-binary species tree instead. This option is recommended when the gene trees are not binary and their degree of resolution is low; otherwise, the program performs an exhaustive search for a binary species tree. In that case, the -t option can be used to limit the search time, which is given in minutes.

By default, it is assumed that only one individual is sampled per species in the gene trees. The -a option allows multiple alleles to be sampled per species.
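The following Python sketch shows one way to read a taxa map such as the one used in the second example of this page. The syntax (species name, colon, comma-separated alleles, groups separated by semicolons, all inside angle brackets) is inferred from that example only, and the function name is hypothetical.

```python
def parse_taxa_map(text):
    """Turn '<species:allele,allele; ...>' into an allele -> species dict.

    The format is an assumption inferred from the example on this page.
    """
    body = text.strip()
    if body.startswith("<") and body.endswith(">"):
        body = body[1:-1]
    allele_to_species = {}
    for group in body.split(";"):
        group = group.strip()
        if not group:
            continue  # tolerate a trailing semicolon
        species, alleles = group.split(":", 1)
        for allele in alleles.split(","):
            allele_to_species[allele.strip()] = species.strip()
    return allele_to_species


mapping = parse_taxa_map("<z:a1,a2,a; y:b1,b2,b; c:c; d:d; e:e>")
print(mapping["a1"], mapping["b2"], mapping["c"])
```

Under this reading, the gene tree leaves a1, a2, and a all belong to species z, and b1, b2, and b belong to species y, while c, d, and e each map to a species of the same name.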

Examples

#NEXUS

BEGIN NETWORKS;

Network g1 = ((((a:5,b:5):4,c:9):3,d:12):3,e:15);
Network g2 = ((a:6,b:6):11,((c:12,e:12):2,d:14):3);
Network g3 = ((a:8,c:8):7,((b:14,e:14):1,d:15));

END;


BEGIN PHYLONET;

Infer_ST_MDC_UR (g1, g2, g3);

END;

#NEXUS

BEGIN NETWORKS;

Network g1 = ((((a1::.5,b1::.5)::.5,c::.5)::.5,d::.5)::.5,e::.5)::.5;
Network g2 = ((a2::.5,b2::.5)::.5,((c::.5,e::.5)::.5,d::.5)::.5)::.5;
Network g3 = ((a::.5,c::.5)::.5,((b::.5,e::.5)::.5,d::.5)::.5)::.5;

END;


BEGIN PHYLONET;

Infer_ST_MDC_UR (g1, g2, g3) -b .5 -e .2 -x -ur -t 1 -a <z:a1,a2,a; y:b1,b2,b; c:c; d:d; e:e>;

END;

Command References

  • Y. Yu, T. Warnow, and L. Nakhleh. Algorithms for MDC-based multi-locus phylogeny inference. The 15th Annual International Conference on Research in Computational Molecular Biology (RECOMB), pages 531-545, 2011. LNBI 6577.

See Also
