Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Convert the log file generated by our MCMC _SEQ sampling method to be readable by Tracer.

PhyloNet Version: 3.6.3

Usage

 

SummarizeMCMCResults -mode "Tracer" [-cl chainLength] [-bl burnInLength] [-sf sampleFrequency] [-outfile filename] [-truenet networkString]

Settings

-cl chainLength

The length of the MCMC chain. Set to the same value used in MCMC_SEQ.

required

-bl burnInLengthThe number of iterations in burn-in period. Set to the same value used in MCMC_SEQ.required

-sf sampleFrequency

The sample frequency. Set to the same value used in MCMC_SEQ.

required

-outfile filenameThe name of tracer log file you want to generate. Can contain path.required
-truenet networkString The network you want to summarize its parameters.required
MC3 Settings
-mc3 temperatureList

The list of temperatures for the Metropolis-coupled MCMC chains. For example, -mc3 (2.0, 3.0)indicates two hot chains with temperatures 2.0 and 3.0 respectively will be run along with the cold chain with temperature 1.0. By default only the cold chain will be run. Note that

  • The temperatures should be DIFFERENT! For example, -mc3 (2.0, 2.0, 3.0) is invalid.
  • The temperature of the cold chain should NOT be included. For example, -mc3 (1.0, 2.0, 3.0) is incorrect.
  • Metropolis-coupled MCMC leads to faster convergence and better mixing, however, the running time increases linearly with the number of chains. We suggest you first run a standard MCMC chain (cold chain) without this command. If the trace plot indicates the chain is not mixed well (jagged, stuck in local maxima for a long time), then try this command.
optional
Inference Settings
-mr maxReticulationThe maximum number of reticulation nodes in the sampled phylogenetic networks. The default value is 4.optional
-taxa taxaListThe taxa used for inference. For example, -taxa (a,b,c)required
-tm taxonMapGene tree / species tree taxa association. By default, it is assumed that only one individual is sampled per species in gene trees. This option allows multiple alleles to be sampled. For example, the gene tree is (((a1,a2),(b1,b2)),c); and the species tree is ((a,b),c);, the command is -tm <a:a1,a2; b:b1,b2;c:c>. Note that the taxa association should cover all species, e.g. -tm <a:a1,a2; b:b1,b2> is incorrect because c:c is dropped out. optional
-fixtheta thetaFix the population mutation rates associated with all branches of the phylogenetic network to this given value (theta). By default, we estimate a constant population size across all branches.optional
-varythetaThe population mutation rates across all branches may be different when estimating them. By default, we estimate a constant population size across all branches.optional
-espthetaEstimate the mean value of prior of population mutation rates.optional
Prior Settings
-pp poissonParamThe Poisson parameter in the prior on the number of reticulation nodes. The default value is 1.0.

optional

-ddDisable the prior on the diameters of hybridizations. By default this prior on is exp(10).optional
-eeEnable the Exponential(10) prior on the divergence times of nodes in the phylogenetic network. By default we use Uniform prior.optional
Starting State Settings
-snetSpecify the starting network. The input network should be ultrametric with divergence times in units of expected number of mutations per site, inheritance probabilities and population sizes in units of population mutation rate (optional). See example below. The default starting network is the MDC trees given starting gene trees. optional
-ptheta startingThetaPriorSpecify the mean value of prior of population mutation rate (startingThetaPrior). The default value is 0.036. If -esptheta is used, startingThetaPrior will be treated as the starting value, otherwise startingThetaPrior will be treated as the fixed mean value of prior of population mutation rates.optional

Data related settings

-diploidSpecify whether sequence sampled from diploids.optional
-dominant dominantMarkerSpecify which marker is dominant if the data is dominant. Either be '0' or '1'.optional
-opSpecify whether or not to ignore all monomorphic sites. If this option is used, the data will be treated as containing only polymorphic sites.optional

 

Example

 

#NEXUS
Begin data;
Dimensions ntax=5 nchar=10;
Format datatype=dna symbols="012" missing=? gap=-;
Matrix

a 1010101010
b 1100100110
c1 1010100011
c2 1001110100
d 1000011110

;End;

BEGIN PHYLONET;
MCMC_BiMarkers -cl 500000 -bl 200000 -sf 500 -diploid -dominant 1 -op -varytheta -pp 2.0 -ee 2.0 -mr 1 -pl 4 -esptheta -ptheta 0.3
-sd 12345678

-taxa (a,b,c1,c2,d)
-tm <A:a; B:b;C:c1,c2;D:d>;

END;

 

 

Example

 

If we run MCMCseq_example1.nex , which is one of the example in the description of command MCMC_SEQ, we will get following output. Suppose it is printed into file example_mcmcseq_1.out.

Then the content of example_mcmcseq_1.out looks like:

 

 

MCMC_SEQ -cl 50000 -bl 10000 -sgt (gt0, gt1) -snet (net1) -sps 0.04 -pre 20
Output files under /Users/zhujiafan
SET _POP_SIZE_MEAN = 0.046099144360160785
SET _POP_SIZE_WINDOW_SIZE = 0.0023049572180080394
SET _TIME_WINDOW_SIZE = 0.0025150755969370955
----------------------- Logger: -----------------------
Iteration; Posterior; ESS; Likelihood; Prior; ESS; #Reticulation
1; -2370.95407; 0.00000; -2370.64467; -0.30939; 0.00000; 1;
[0.03712369495639094](((C:0.023825648320444256,B:0.023825648320444256)I2:0.050125924499740626)I3#H1:0.4705848357733278::0.4977571578778074,(A:0.09748469830982424,I3#H1:0.02353312548963936::0.5022428421221926)I1:0.44705171028368845)I0;
2; -2370.02122; 0.00000; -2365.48071; -4.54051; 0.00000; 1;
[0.08740412526529193](((A:0.0644782523371697)I3#H1:0.011085100576159218::0.7394821248524521,(B:0.012295023776453121,C:0.012295023776453121)I2:0.06326832913687579)I1:0.7099466953690496,I3#H1:0.7210317959452088::0.2605178751475479)I0;
3; -2372.45184; 1.00000; -2368.78300; -3.66884; 1.00000; 1;
[0.09033941530851464](((A:0.045146002769816694)I3#H1:0.007585944777036023::0.8848743613665822,(C:0.01813233140385818,B:0.01813233140385818)I2:0.03459961614299453)I1:3.898141499134046,I3#H1:3.905727443911082::0.11512563863341785)I0;
4; -2367.92188; 2.00000; -2365.28211; -2.63977; 2.00000; 1;
[0.07069491916656903](((B:0.00872426064648134,C:0.00872426064648134)I2:0.06083313993717342,(A:0.006588553878875061)I3#H1:0.06296884670477969::0.8996115971738128)I1:5.372106361639526,I3#H1:5.435075208344306::0.10038840282618722)I0;
5; -2370.70912; 3.00000; -2367.77665; -2.93247; 3.00000; 1;
[0.06524159621009873](((B:0.0020694551571076533,C:0.0020694551571076533)I2:0.05700062041463809,(A:0.05752026401374041)I3#H1:0.0015498115580053276::0.6506663662068655)I1:3.895283178099869,I3#H1:3.8968329896578746::0.34933363379313453)I0;
6; -2370.17278; 4.00000; -2367.72976; -2.44302; 4.00000; 1;
[0.08326098635086286](((B:0.01573581643597444,C:0.01573581643597444)I2:0.054166996949172486,(A:0.05445033542656785)I3#H1:0.01545247795857907::0.2708973259263968)I1:3.1609361182119855,I3#H1:3.1763885961705647::0.7291026740736032)I0;
7; -2375.23091; 5.00000; -2370.29901; -4.93190; 5.00000; 1;
[0.05379778176231088]((((C:0.00999397908818145,B:0.00999397908818145)I2:0.08374945041951844)I3#H1:0.006697029242945116::0.7206438532760768,A:0.100440458750645)I1:14.193400722603458,I3#H1:14.200097751846403::0.27935614672392317)I0;
8; -2367.29436; 6.00000; -2366.26747; -1.02688; 6.00000; 1;
[0.03543271735376193](((B:0.005849707761867801,C:0.005849707761867801)I2:0.07841641875267599,(A:0.01271426487814302)I3#H1:0.07155186163640077::0.8622136520013726)I1:4.6994134208375336,I3#H1:4.770965282473934::0.13778634799862743)I0;
9; -2373.22179; 7.00000; -2368.15899; -5.06280; 7.00000; 1;
[0.06618446114985434]((A:0.06301458366842101,((B:0.0020198944585432693,C:0.0020198944585432693)I2:0.00993963859785437)I3#H1:0.051055050612023374::0.8453466268608258)I1:5.141614377672805,I3#H1:5.1926694282848285::0.15465337313917416)I0;
10; -2378.18933; 8.00000; -2366.39462; -11.79471; 8.00000; 2;
[0.09440157575724482]((((((C:0.005310928295931936,B:0.005310928295931936)I2:0.05954064895844466,A:0.0648515772543766)I1:16.126610263480238)#H1:6.7365747769686415::0.20921855279391943)#H2:1.3049036309951738::0.8790975883804708,#H1:8.041478407963815::0.7907814472060806):0.9036343768434847,#H2:2.2085380078386585::0.1209024116195292)I0;
----------------------- Summarization: -----------------------
Burn-in = 10000, Chain length = 50000, Sample size = 8, Acceptance rate = 0.70918
--------------- Operations ---------------
Operation:NarrowNNI; Used:20970; Accepted:0 ACrate:0.0
Operation:Swap-Nodes; Used:2783; Accepted:611 ACrate:0.21954725116780452
Operation:SubtreeSlide; Used:20754; Accepted:6081 ACrate:0.29300375831165076
Operation:Scale-Time; Used:2201; Accepted:1484 ACrate:0.6742389822807815
Operation:Add-Reticulation; Used:258; Accepted:4 ACrate:0.015503875968992248
Operation:Slide-SubNet; Used:13653; Accepted:4412 ACrate:0.32315242071339634
Operation:TreeScaler; Used:5202; Accepted:2824 ACrate:0.5428681276432141
Operation:Flip-Reticulation; Used:804; Accepted:22 ACrate:0.02736318407960199
Operation:Change-PopSize-Prior-Param; Used:437; Accepted:220 ACrate:0.5034324942791762
Operation:Scale-Root-Time; Used:2722; Accepted:2350 ACrate:0.8633357825128581
Operation:Scale-PopSize-Prior-Param; Used:367; Accepted:327 ACrate:0.8910081743869209
Operation:Scale-All; Used:620; Accepted:72 ACrate:0.11612903225806452
Operation:Move-Head; Used:914; Accepted:163 ACrate:0.17833698030634573
Operation:TreeRootScaler; Used:5128; Accepted:1579 ACrate:0.3079173166926677
Operation:WildNNI; Used:5090; Accepted:0 ACrate:0.0
Operation:TNodeReheight; Used:41825; Accepted:6414 ACrate:0.15335325762104005
Operation:Change-Time; Used:15414; Accepted:7158 ACrate:0.464383028415726
Operation:Scale-PopSize; Used:549; Accepted:478 ACrate:0.8706739526411658
Operation:Change-Inheritance; Used:797; Accepted:443 ACrate:0.5558343789209536
Operation:Move-Tail; Used:2360; Accepted:26 ACrate:0.011016949152542373
Operation:Delete-Reticulation; Used:251; Accepted:3 ACrate:0.01195219123505976
Operation:WilsonBalding; Used:5250; Accepted:0 ACrate:0.0
Operation:Change-PopSize; Used:1651; Accepted:788 ACrate:0.47728649303452453

Overall MAP = -2367.294357178221
(((B:0.005849707761867801,C:0.005849707761867801)I2:0.07841641875267599,(A:0.01271426487814302)I3#H1:0.07155186163640077::0.8622136520013726)I1:4.6994134208375336,I3#H1:4.770965282473934::0.13778634799862743)I0;
-------------- 95% credible set of topologies --------------
Rank = 0; Size = 5; Percent = 62.50; MAP = -2367.294357178221:(((B:0.005849707761867801,C:0.005849707761867801)I2:0.07841641875267599,(A:0.01271426487814302)I3#H1:0.07155186163640077::0.8622136520013726)I1:4.6994134208375336,I3#H1:4.770965282473934::0.13778634799862743)I0; Ave=-2369.709994467344; ((A:0.03528388419342861)I3#H1:4.236997904111553::0.26626965889975657,((B:0.010102314281057883,C:0.010102314281057883)I2:0.057003358439330895,I3#H1:0.031821788526960174::0.7337303411002434)I1:4.205176115584593)I0;
Rank = 1; Size = 2; Percent = 25.00; MAP = -2373.221789051192:((A:0.06301458366842101,((B:0.0020198944585432693,C:0.0020198944585432693)I2:0.00993963859785437)I3#H1:0.051055050612023374::0.8453466268608258)I1:5.141614377672805,I3#H1:5.1926694282848285::0.15465337313917416)I0; Ave=-2374.2263488001636; (((B:0.0060069367733623594,C:0.0060069367733623594)I2:0.04684454450868641)I3#H1:9.696383590065615::0.13967807336196159,(A:0.081727521209533,I3#H1:0.028876039927484234::0.8603219266380384)I1:9.667507550138131)I0;
Rank = 2; Size = 1; Percent = 12.50; MAP = -2378.189329515679:((((((C:0.005310928295931936,B:0.005310928295931936)I2:0.05954064895844466,A:0.0648515772543766)I1:16.126610263480238)#H1:6.7365747769686415::0.20921855279391943)#H2:1.3049036309951738::0.8790975883804708,#H1:8.041478407963815::0.7907814472060806):0.9036343768434847,#H2:2.2085380078386585::0.1209024116195292)I0; Ave=-2378.189329515679; ((((A:0.0648515772543766,(B:0.005310928295931936,C:0.005310928295931936)I2:0.05954064895844466)I1:16.126610263480238)#H2:6.7365747769686415::0.20921855279391943)#H1:2.2085380078386585::0.1209024116195292,(#H2:8.041478407963815::0.7907814472060806,#H1:1.3049036309951738::0.8790975883804708):0.9036343768434847)I0;

Total elapsed time : 23.77700 s

 

Suppose we want to analyze the parameters of sampled top network using Tracer, we can write such a NEXUS file. Note that the "truenet" string is copied from the "Rank = 0" network. And remember to put the log file into SETS section.

 

 

#NEXUS

BEGIN SETS;
/Users/zhujiafan/Documents/BioinfoData/Report/example_mcmcseq_1.out
END;

BEGIN PHYLONET;
SummarizeMCMCResults -cl 50000 -bl 10000 -sf 5000
-mode "Tracer"
-outfile "/Users/zhujiafan/Documents/BioinfoData/Report/report.txt"
-truenet "(((B:0.005849707761867801,C:0.005849707761867801)I2:0.07841641875267599,(A:0.01271426487814302)I3#H1:0.07155186163640077::0.8622136520013726)I1:4.6994134208375336,I3#H1:4.770965282473934::0.13778634799862743)I0; ";
END;

 

 Run this NEXUS file using PhyloNet. PhyloNet will print a string showing how nodes are labeled, and report.txt is generated, which is readable by Tracer. 

Following is what we will see after import report.txt into Tracer. In the column of Statistic, gamma represents inheritance probability, tau represents branch length, and theta represents population mutation rate.

Image Added

 

Note that an empty line should be left after "Matrix".

This command will run MCMC chain of 500000 iterations with 200000 burn-in iterations, and one sample will be collected every 500 iterations. The taxa are diploids and 1 is the dominant marker. Only polymorphic sites will be used. We will estimate population mutation rates for every branches, and they may be different. A Poisson prior of 2.0 will be adopted, and a Exponential(2.0) prior will be adopted. The number of reticulation nodes is limited to 1. We will sample the mean value of prior of population mutation rates, and the starting value of 0.3 is given. We use the random seed of 12345678. In the end, we indicate the mapping from taxa to species.