Description
- Co-estimation of reticulate phylogenies (accounting for incomplete lineage sorting (ILS) and hybridization) and gene trees on sequences from multiple independent loci.
- We use BEAGLE, a high-performance library, to calculate the Felsenstein likelihood. Full installation instructions can be found here.
Usage
MCMC_SEQ -loci locusList [-cl chainLength] [-bl burnInLength] [-sf sampleFrequency] [-sd seed] [-pl parallelThreads] [-outDir outDirectory] [-mc3 temperatureList] [-mr maxReticulation] [-tm taxonMap] [-fixps popSize] [-varyps] [-pp poissonParameter] [-dd] [-ee] [-gtr paramList]
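As a rough illustration of how the optional flags combine, the following Python sketch assembles a command string in the shape of the usage line above. The helper names `format_list` and `build_mcmc_seq_command` and their keyword arguments are hypothetical; only the flag names come from the usage line, and the parenthesized, comma-separated list format is assumed from the list examples given in the option descriptions.

```python
def format_list(values):
    """Render values as a parenthesized, comma-separated list, e.g. (a,b)."""
    return "(" + ",".join(str(v) for v in values) + ")"


def build_mcmc_seq_command(loci=None, chain_length=None, burn_in=None,
                           sample_frequency=None, seed=None,
                           temperatures=None, max_reticulation=None):
    """Assemble an MCMC_SEQ command string from a subset of the documented flags.

    Arguments left as None are omitted so that the program's defaults apply.
    """
    parts = ["MCMC_SEQ"]
    if loci is not None:
        parts += ["-loci", format_list(loci)]
    # Scalar flags, in the same order as the usage line.
    scalar_flags = [
        ("-cl", chain_length),
        ("-bl", burn_in),
        ("-sf", sample_frequency),
        ("-sd", seed),
    ]
    for flag, value in scalar_flags:
        if value is not None:
            parts += [flag, str(value)]
    if temperatures is not None:
        parts += ["-mc3", format_list(temperatures)]
    if max_reticulation is not None:
        parts += ["-mr", str(max_reticulation)]
    return " ".join(parts)


command = build_mcmc_seq_command(
    loci=["YNR008W", "YNL313C"],
    chain_length=500000,
    burn_in=100000,
    sample_frequency=1000,
    temperatures=[2.0, 3.0],
)
print(command)
```

The numeric values here are arbitrary placeholders chosen for the example, not recommended settings.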
MCMC Settings | ||
-loci locusList | The list of loci used in the inference. For example, a list (YNR008W,YNL313C) indicates the inference is performed on two loci YNR008W and YNL313C. See the format of multilocus data here. | optional |
-cl chainLength | The length of the MCMC chain. The default value is 1,000,000. | optional |
-bl burnInLength | The number of iterations in burn-in period. The default value is 200,000. | optional |
-sf sampleFrequency | The sample frequency, i.e. the number of iterations between collected samples. The default value is 5,000. | optional |
-sd seed | The random seed. The default seed is 12345678. | optional |
-pl parallelThreads | The number of threads running in parallel. The default value is the number of threads available on your machine. | optional |
-outDir outDirectory | The absolute path of the directory in which to store the output files. The default value is the home directory. | optional |
MC3 Settings | ||
-mc3 temperatureList | The list of temperatures for the Metropolis-coupled MCMC chains. For example, a list (2.0, 3.0) indicates two hot chains with temperatures 2.0 and 3.0 respectively will be run along with the cold chain with temperature 1.0. By default only the cold chain will be run. | optional |
Inference Settings | ||
-mr maxReticulation | The maximum number of reticulation nodes in the sampled phylogenetic networks. The default value is 4. | optional |
-tm taxonMap | Gene tree / species tree taxa association. By default, it is assumed that only one individual is sampled per species in gene trees. However, this option allows multiple alleles to be sampled. | optional |
-fixps popSize | Fix the population sizes associated with all branches of the phylogenetic network to this given value. By default, we estimate a constant population size across all branches. | optional |
-varyps | Vary the population sizes across all branches. By default, we estimate a constant population size across all branches. | optional |
Prior Settings | ||
-pp poissonParameter | The Poisson parameter in the prior on the number of reticulation nodes. The default value is 1.0. | optional |
-dd | Disable the prior on the diameters of hybridizations. By default, this prior is Exponential(10). | optional |
-ee | Enable the Exponential(10) prior on the divergence times of nodes in the phylogenetic network. By default, a uniform prior is used. | optional |
Substitution Model | ||
-gtr paramList | Set GTR (general time-reversible) as the substitution model. The first four parameters in the list are the base frequencies for A, C, G, and T. The remaining six parameters are the transition probabilities for A>C, A>G, A>T, C>G, C>T, and G>T. The default substitution model is JC69. | optional |
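The `-gtr` description implies a list of exactly ten numbers, and the chain settings together determine how many samples a run yields. The Python sketch below checks a candidate parameter list and estimates the sample count. Both function names are hypothetical. The requirement that the four base frequencies sum to 1 is a general property of frequencies rather than something this page states, and the sample-count formula assumes that samples are kept only after burn-in, once every `sampleFrequency` iterations.

```python
import math


def check_gtr_params(params):
    """Split a 10-value -gtr list into base frequencies and the six pairwise values."""
    if len(params) != 10:
        raise ValueError(f"expected 10 GTR parameters, got {len(params)}")
    if any(v < 0 for v in params):
        raise ValueError("GTR parameters must be non-negative")
    # First four values: base frequencies, in the documented order A, C, G, T.
    freqs = dict(zip("ACGT", params[:4]))
    # Last six values: one per nucleotide pair, in the documented order.
    pairs = ["A>C", "A>G", "A>T", "C>G", "C>T", "G>T"]
    substitutions = dict(zip(pairs, params[4:]))
    if not math.isclose(sum(freqs.values()), 1.0, abs_tol=1e-9):
        raise ValueError("base frequencies should sum to 1")
    return freqs, substitutions


def expected_samples(chain_length=1_000_000, burn_in=200_000, sample_frequency=5_000):
    """Estimate the number of retained samples (assumption: sampling starts after burn-in)."""
    if burn_in >= chain_length:
        raise ValueError("burn-in must be shorter than the chain")
    return (chain_length - burn_in) // sample_frequency


freqs, substitutions = check_gtr_params([0.25, 0.25, 0.25, 0.25, 1, 1, 1, 1, 1, 1])
print(freqs["A"], substitutions["C>T"])
print(expected_samples())
```

Under the stated assumption, the documented defaults (chain length 1,000,000, burn-in 200,000, sample frequency 5,000) give an estimate of 160 samples.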
Understanding the Output
System Output
- Logger: each time a sample is collected, the program prints the iteration number, the posterior value, the current ESS (Effective Sample Size) based on the posterior values, the likelihood value, the prior value, the current ESS based on the prior values, the number of reticulations, and the sampled phylogenetic network.
- Summarization: the program prints out the chain length, burn-in length, sample frequency and the overall acceptance rate of proposals.
- Operations: the usage and the acceptance rate for each operation.
- Topologies: the MAP (Maximum A Posteriori) topology is given. For each unique topology, the network with the maximum posterior value and the averaged network (averaged over branch lengths and inheritance probabilities) are printed. The topologies are ranked by their posterior probabilities.
- Run time: the elapsed time.
----------------------- Logger -----------------------
Iteration; Posterior; ESS; Likelihood; Prior; ESS; #Reticulation
0; -2736.29030; 0.00000; -2732.29030; -4.00000; 0.00000; 0; (TaB:1.0,(TaA:1.0,TaD:1.0):1.0);
1; -2466.11338; 0.00000; -2461.94926; -4.16411; 0.00000; 0; (TaB:0.3961637257713907,(TaA:1.9334838641021195,TaD:1.6038362742286094):0.23062827100884262);
...
18; -2441.04154; 2.93524; -2430.03388; -11.00767; 2.73299; 1; (((TaD:0.09597337108498416,(TaB:0.23136218040818857)I2#H1:0.3566653185047678::0.3491095120667689)I3:0.868706329833222,TaA:1.9312992553517092)I1:0.6058432353787028,I2#H1:0.6744658622739422::0.6508904879332311)I0;
19; -2438.81459; 3.12108; -2428.96331; -9.85129; 3.06350; 1; (((TaD:0.9712042239200506)I3#H1:0.11593455786736508::0.3709418852580564,TaA:1.665769657982743)I1:0.9242235576727479,(TaB:0.5027340616085754,I3#H1:0.12410067880026514::0.6290581147419436)I2:0.32269541431429016)I0;
20; -2442.52537; 3.71247; -2430.97186; -11.55351; 3.29413; 1; ((TaA:0.407024316728807,(TaD:0.08498134125891532)I3#H1:0.1385571192167834::0.3614459570126758)I1:1.8568299946014093,(TaB:0.42061630776567865,I3#H1:1.0672749770829797::0.6385540429873242)I2:0.3889481921313129)I0;
----------------------- Summarization -----------------------
--------------- Topologies ---------------
Overall MAP = -2437.0654296232783
((TaB:0.561805846080817)I2#H1:0.46079502961747665::0.5721872707642428,(TaA:0.8950782820984362,(I2#H1:0.3411560735694928::0.4278127292357572,TaD:0.006727870085083132)I3:0.5352165283781162)I1:0.5975616454139091)I0;
Rank = 0; Size = 5; Percent = 0.5; MAP = -2437.1534368461857:(((TaD:0.3698584300172657)I3#H1:0.5289684026069202::0.4412419361674995,TaA:0.7616303047346242)I1:0.7750966068147521,(I3#H1:0.1429345953478124::0.5587580638325005,TaB:0.4229104875916929)I2:0.3596176393163555)I0; Ave=-2438.818599313804;
((TaB:0.4160163591812,(TaD:0.3919469161549564)I3#H1:0.31796822920332624::0.47841193003273014)I2:0.3651396905586732,(TaA:0.8638161723172626,I3#H1:0.4573559657851387::0.5215880699672699)I1:1.1104505168595664)I0;
Total elapsed time : 19.42900 s
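To post-process the Logger section, each record can be split on semicolons. The Python sketch below assumes that a record holds the seven columns named in the Logger header (Iteration; Posterior; ESS; Likelihood; Prior; ESS; #Reticulation) followed by the network string; the function names and dictionary keys are hypothetical. It also illustrates regularities visible in the sample output: the posterior column equals the sum of the likelihood and prior columns up to rounding, and the two inheritance probabilities of a reticulation sum to 1.

```python
import re


def parse_logger_record(line):
    """Parse one Logger record into a dict of numeric fields plus the network string."""
    # The network contains ':' and ',' but only a trailing ';', so limit the split.
    fields = line.split(";", 7)
    if len(fields) != 8:
        raise ValueError("expected 7 numeric fields followed by a network")
    return {
        "iteration": int(fields[0]),
        "posterior": float(fields[1]),
        "ess_posterior": float(fields[2]),
        "likelihood": float(fields[3]),
        "prior": float(fields[4]),
        "ess_prior": float(fields[5]),
        "reticulations": int(fields[6]),
        "network": fields[7].strip(),
    }


def inheritance_probabilities(network):
    """Return the values that follow '::' on the edges entering reticulation nodes."""
    return [float(x) for x in re.findall(r"::([0-9.]+)", network)]


def hybrid_labels(network):
    """Return the distinct '#H<n>' labels that mark reticulation nodes."""
    return set(re.findall(r"#H\d+", network))


# Record for iteration 19, copied from the sample output above.
record = parse_logger_record(
    "19; -2438.81459; 3.12108; -2428.96331; -9.85129; 3.06350; 1; "
    "(((TaD:0.9712042239200506)I3#H1:0.11593455786736508::0.3709418852580564,"
    "TaA:1.665769657982743)I1:0.9242235576727479,"
    "(TaB:0.5027340616085754,I3#H1:0.12410067880026514::0.6290581147419436)"
    "I2:0.32269541431429016)I0;"
)

gap = abs(record["likelihood"] + record["prior"] - record["posterior"])
probs = inheritance_probabilities(record["network"])
print(record["iteration"], record["reticulations"], len(hybrid_labels(record["network"])))
print(gap < 1e-4, abs(sum(probs) - 1.0) < 1e-9)
```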
Samples
- Phylogenetic Network
- Gene trees
- Population size
Command References
- D. Wen and L. Nakhleh. Co-estimating reticulate phylogenies and gene trees on sequences from multiple independent loci. Submitted.