MLE_BiMarkers [-diploid] [-dominant dominantMarker] [-op] [-] [-sd seed] [-pl parallelThreads] [-mr maxReticulation] [-tm taxonMap] [-fixtheta theta] [-varytheta] [-esptheta] [-snet startingNetwork] [-ptheta startingTheta] [-pi0 PI0]

ML Settings
-mnr numRuns

The number of iterations of simulated annealing. The default value is 100.

optional

-mec maxExaminationsCount

The maximum number of times a state may be examined during one iteration. The default value is 50,000.

optional

-mno numOptimums

The number of optimal networks to print. The default value is 10.

optional

-mf maxFailures

The maximum number of failures to accept a new state during one iteration. The default value is 50.

optional

-pl parallelThreads

The number of threads running in parallel. The default value is the number of threads available on your machine.

optional
MC3 Inference Settings
-mc3 temperatureList

The list of temperatures for the Metropolis-coupled MCMC chains. For example, -mc3 (2.0, 3.0) indicates that two hot chains with temperatures 2.0 and 3.0 will be run along with the cold chain, which has temperature 1.0. By default, only the cold chain is run. Note that:

  • The temperatures must all be different. For example, -mc3 (2.0, 2.0, 3.0) is invalid.
  • The temperature of the cold chain must not be included. For example, -mc3 (1.0, 2.0, 3.0) is incorrect.
  • Metropolis-coupled MCMC leads to faster convergence and better mixing, but the running time increases linearly with the number of chains. We suggest first running a standard MCMC chain (cold chain) without this option. If the trace plot indicates that the chain is not mixing well (jagged, or stuck in local maxima for a long time), then try this option.

optional
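The two rules for the temperature list can be checked before a run is launched. The following is a minimal Python sketch, not part of PhyloNet; the helper name format_mc3 is hypothetical, and it enforces only the two rules stated for -mc3 (distinct temperatures, and no cold-chain temperature 1.0).

```python
def format_mc3(temperatures):
    """Return a '-mc3 (...)' argument, or raise ValueError if a rule is broken.

    Hypothetical helper: checks only that the temperatures are distinct
    and that the cold-chain temperature 1.0 is not listed.
    """
    temps = [float(t) for t in temperatures]
    if len(set(temps)) != len(temps):
        raise ValueError("temperatures must all be different")
    if 1.0 in temps:
        raise ValueError("the cold chain temperature 1.0 must not be included")
    return "-mc3 (" + ", ".join(str(t) for t in temps) + ")"


if __name__ == "__main__":
    print(format_mc3([2.0, 3.0]))
```

A list such as [2.0, 2.0, 3.0] or [1.0, 2.0, 3.0] raises ValueError instead of producing an invalid option.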
Inference Settings
pseudo

Use pseudolikelihood.

optional

-mr maxReticulation

The maximum number of reticulation nodes in the sampled phylogenetic networks. The default value is 4.

optional

-taxa taxaList

The taxa used for inference. For example, -taxa (a,b,c)

required

-tm taxonMap

Gene tree / species tree taxa association. By default, it is assumed that only one individual is sampled per species in gene trees. This option allows multiple alleles to be sampled. For example, if the gene tree is (((a1,a2),(b1,b2)),c); and the species tree is ((a,b),c);, the command is -tm <a:a1,a2; b:b1,b2;c:c>. Note that the taxa association must cover all species, e.g. -tm <a:a1,a2; b:b1,b2> is incorrect because c:c is left out.

optional

-fixtheta theta

Fix the population mutation rates associated with all branches of the phylogenetic network to the given value (theta). By default, we estimate a constant population size across all branches.

optional

-varytheta

Allow the population mutation rates to differ across branches when estimating them. By default, we estimate a constant population size across all branches.

optional

-esptheta

Estimate the mean of the prior on the population mutation rates.

optional
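Because the -tm association has to cover every species, it can help to generate the argument from a dictionary rather than typing it by hand. The following is a minimal Python sketch, not part of PhyloNet; the helper name format_taxon_map is hypothetical, and it assumes that whitespace after a semicolon is not significant, since the examples on this page use both "; " and ";".

```python
def format_taxon_map(species_to_alleles, species):
    """Build a -tm argument such as '<a:a1,a2; b:b1,b2; c:c>'.

    species_to_alleles: dict mapping species name -> list of allele names.
    species: every species in the species tree; each one must be mapped.
    Hypothetical helper for illustration only.
    """
    missing = [s for s in species if not species_to_alleles.get(s)]
    if missing:
        raise ValueError(f"taxon map does not cover species: {missing}")
    entries = [f"{s}:{','.join(species_to_alleles[s])}" for s in species]
    return "<" + "; ".join(entries) + ">"


if __name__ == "__main__":
    mapping = {"a": ["a1", "a2"], "b": ["b1", "b2"], "c": ["c"]}
    print("-tm " + format_taxon_map(mapping, ["a", "b", "c"]))
```

Leaving out c from the dictionary raises ValueError, which mirrors the incomplete -tm <a:a1,a2; b:b1,b2> case described in the table.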
Prior Settings
-pp poissonParam

The Poisson parameter in the prior on the number of reticulation nodes. The default value is 1.0.

optional

-dd

Disable the prior on the diameters of hybridizations. By default, this prior is exp(10).

optional

-ee

Enable the Exponential(10) prior on the divergence times of nodes in the phylogenetic network. By default, a uniform prior is used.

optional
Starting State Settings
-snet startingNetwork

Specify the starting network. The input network should be ultrametric, with divergence times in units of expected number of mutations per site, inheritance probabilities, and population sizes in units of population mutation rate (optional). See example below. The default starting network is the MDC tree given the starting gene trees.

optional

-ptheta startingThetaPrior

Specify the mean of the prior on the population mutation rate (startingThetaPrior). The default value is 0.036. If -esptheta is used, startingThetaPrior is treated as the starting value; otherwise it is treated as the fixed mean of the prior on the population mutation rates.

optional

Data Related Settings

-diploid

Specify that the sequences are sampled from diploids.

optional

-dominant dominantMarker

Specify which marker is dominant if the data is dominant. Must be either '0' or '1'.

optional

-op

Specify whether or not to ignore all monomorphic sites. If this option is used, the data will be treated as containing only polymorphic sites.

optional

 

Example

Download: run_0.nex

 

#NEXUS
Begin data;
Dimensions ntax=5 nchar=100;
Format datatype=dna symbols="012" missing=? gap=-;
Matrix

A_0 1001011010101011001000010101010111001010011001100101111011000011111000001010001001100000110100001011
C_0 1001111011101011001001010101010111011010010001100001111111001000111000001010011001100100100110001011
L_0 1001011010100111001000010101010111001010011001100101111111001011110000001010001001100000110100001011
Q_0 1001011010101011001001010101010111001010011001100101111111001011110000001010001001100000110100001011
R_0 1001011010101011001101010001010111001110011001100101011111001011110000001010101001100000100100001001
;End;
BEGIN PHYLONET;
MLE_BiMarkers pseudo -mnr 10 -mec 50000 -mno 20 -mf 100 -pi0 0.5 -mr 1 -pl 8 -ptheta 0.006 -thetawindow 0.006 -sd 12345678 -taxa (A_0,C_0,L_0,R_0,Q_0) -tm <A:A_0; C:C_0;L:L_0;Q:Q_0;R:R_0> ;
END;

 

Note that an empty line should be left after "Matrix".
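The data block can also be generated programmatically, which keeps nchar consistent with the sequences and guarantees the empty line after "Matrix". The following is a minimal Python sketch, not part of PhyloNet; the function name nexus_data_block is hypothetical, and the Format line is copied from the example on this page.

```python
def nexus_data_block(sequences):
    """Return a NEXUS data block for bi-allelic marker sequences.

    sequences: dict mapping taxon name -> marker string (all the same length).
    Hypothetical helper; the layout follows the example on this page.
    """
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    nchar = lengths.pop()
    lines = [
        "#NEXUS",
        "Begin data;",
        f"Dimensions ntax={len(sequences)} nchar={nchar};",
        'Format datatype=dna symbols="012" missing=? gap=-;',
        "Matrix",
        "",  # the empty line that must follow "Matrix"
    ]
    lines += [f"{name} {seq}" for name, seq in sequences.items()]
    lines.append(";End;")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(nexus_data_block({"A_0": "1001", "C_0": "1011"}))
```

The PHYLONET block with the MLE_BiMarkers command would then be appended to the returned text.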

This command will run maximum pseudolikelihood estimation for 10 iterations and print 20 optimal networks. A new iteration starts after 100 failures to accept a new state, or after 50000 examinations of new states. We will estimate population mutation rates for all branches, and they are the same across all branches. The number of reticulation nodes is limited to 1. The starting value of the population mutation rate is 0.006. We use the random seed 12345678. Finally, we indicate the mapping from taxa to species.