Description

Bayesian estimation of the posterior distribution of phylogenetic networks given a list of gene tree topologies.

Usage

 

Bayesian Inference

MCMC_GT geneTreeList [-cl chainLength] [-bl burnInLength] [-sf sampleFrequency] [-sd seed] [-pp poissonParameter] [-mr maximumReticulation] [-pl parallelThreads] [-tp temperatureList] [-sn startingNetworkList] [-tm taxonMap] [-pseudo]
geneTreeListComma delimited list of gene tree identifiers or comma delimited list of sets of gene tree identifiers. See details.mandatory

-cl chainLength

The length of the MCMC chain. The default value is 1,100,000.

optional

-bl burnInLengthThe number of iterations in burn-in period. The default value is 100,000.optional

-sf sampleFrequency

The sample frequency. The default value is 1,000.

optional

-sd seedThe random seed. The default seed is 12345678optional

-pp poissonParameter

The Poisson parameter in the prior on the number of reticulation nodes. The default value is 1.0

optional

-mr maximumReticulationThe maximum number of reticulation nodes in the sampled phylogenetic networks. The default value is infinity.optional
-pl parallelThreads The number of threads running in parallel. The default value is 1.optional
-tp temperatureListThe list of temperatures for the Metropolis-coupled MCMC chains. For example, a list (1.0, 2.0, 3.0) indicates a cold chain with temperature 1.0 and two hot chains with temperatures 2.0 and 3.0 respectively will be run. The first value in the list should always be 1.0. The default list is (1.0).optional
-sn startingNetworkListComma delimited list of network identifiers. See details. If the list contains only one network, the network will be used as the starting network for all the chains. If the length of the list equals to the the length of the temperature list, each chain will have a corresponding starting network. The default list is empty and the MDC tree will be used as the starting network.optional
-tm taxonMapGene tree / species tree taxa association. By default, it is assumed that only one individual is sampled per species in gene trees. However, this option allows multiple alleles to be sampled.optional
-pseudoUse pseudo likelihood instead of full likelihood to reduce runtime, see details.optional

Summarization

 

MCMC_GT -sum fileList
fileListThe list of output files from Bayesian Inference.mandatory

 

Examples

Bayesian Inference

 

#NEXUS

BEGIN TREES;

Tree gt1 = (((DF,MZ),(DG,MK)),MC);
Tree gt2 = (((DF,(MK,MZ)),DG),MC);
Tree gt3 = (((DF,MK),(MC,MZ)),DG);
Tree gt4 = ((DF,(MC,(MK,MZ))),DG);
Tree gt5 = (DF,((DG,(MC,MZ)),MK));
Tree gt6 = (((DF,DG),MK),(MC,MZ));
Tree gt7 = ((((DF,DG),MC),MZ),MK);
Tree gt8 = (((DF,MK),MZ),(DG,MC));
Tree gt9 = ((DF,(DG,MC)),(MK,MZ));
Tree gt10 = (((DF,MC),(DG,MK)),MZ);

END;

BEGIN PHYLONET;
MCMC_GT (gt1-gt10) -cl 10000 -bl 5000 -sf 1000;
END;

#NEXUS

 

BEGIN NETWORKS;
Network net1 = (((F:1,K:1):1,G:1):1,C:1):1;
Network net2 = ((F:1,(C:1,K:1):1):1,G:1):1;
END;

BEGIN TREES;

Tree gt1 = (((DF,MZ),(DG,MK)),MC);
Tree gt2 = (((DF,(MK,MZ)),DG),MC);
Tree gt3 = (((DF,MK),(MC,MZ)),DG);
Tree gt4 = ((DF,(MC,(MK,MZ))),DG);
Tree gt5 = (DF,((DG,(MC,MZ)),MK));
Tree gt6 = (((DF,DG),MK),(MC,MZ));
Tree gt7 = ((((DF,DG),MC),MZ),MK);
Tree gt8 = (((DF,MK),MZ),(DG,MC));
Tree gt9 = ((DF,(DG,MC)),(MK,MZ));
Tree gt10 = (((DF,MC),(DG,MK)),MZ);

END;

BEGIN PHYLONET;
MCMC_GT ({gt1-gt2}, {gt3-gt4}, {gt5-gt6}, {gt7-gt8}, {gt9-gt10}) -cl 10000 -bl 5000 -sf 1000 -sd 12345 -pp 0.1 -mr 3 -pl 4 -tp (1.0,2.0) -sn (net1,net2) -tm <F:DF;G:DG;K:MK;C:MC,MZ>;
END;

Summarization

#NEXUS

BEGIN PHYLONET;
MCMC_GT -sum (mcmc1.txt, mcmc2.txt, mcmc3.txt, mcmc4.txt);
END;

Understanding the Output

Bayesian Inference

 

Output of the second example

 

----------------------- Logger -----------------------

Iteration;    Posterior;  ESS;    Likelihood;    Prior;  ESS;    #Reticulation

0;    -32.31804;    0.00000;    -26.31804;   -6.00000;    0.00000;    0;

(C:1.0,(F:1.0,(G:1.0,K:1.0):1.0):1.0);

1;    -30.67945;    0.00000;    -24.36941;   -6.31004;    0.00000;    0;

(((F:1.100296825380626,G:3.091462635493304):0.0672071335563177,C:1.2631834608546109):0.022439620222693815,K:0.7654486703017418);

......

9;    -39.23243;    5.00000;    -24.51469;   -14.71774;    5.00000;    0;

(C:1.303229359865713,(G:1.184262389767834,(F:9.355326885355721,K:1.9913949707859038)I0:0.8785013943118355)I3:0.005027219859507291)I2;

10;    -47.01271;    6.00000;    -23.75691;   -23.25580;    6.00000;    1;

((((G:0.6775006643112376)I1#H1:0.10254403746151251::0.6222138281679854,(K:0.40305290671104776,F:11.774054935191478)I3:1.0127444936379693)I4:0.11119651081773953,C:0.6245266587065907)I2:0.24534630112752273,I1#H1:0.6332472675472998::0.3777861718320146)I0;

----------------------- Summarization: -----------------------

Burn-in = 5000, Chain length = 10000, Sample size = 5 Acceptance rate = 0.62080 

         --------------- Operations ---------------

Operation:Move-Head; Used:196; Accepted:35 ACrate:0.17857142857142858

Operation:Change-Length; Used:5782; Accepted:5628 ACrate:0.9733656174334141

Operation:Add-Reticulation; Used:92; Accepted:1 ACrate:0.010869565217391304

Operation:Flip-Reticulation; Used:191; Accepted:44 ACrate:0.23036649214659685

Operation:Change-Probability; Used:172; Accepted:165 ACrate:0.9593023255813954

Operation:Move-Tail; Used:3564; Accepted:332 ACrate:0.0931537598204265

Operation:Delete-Reticulation; Used:3; Accepted:3 ACrate:1.0

 

Overall MAP = -39.13178296704008

(((G:1.8467353971326255,F:3.669429912707025)I4:0.72972613976057,C:0.7672463997220461)I0:0.14233512944191162,K:7.34631035390478)I3;

         -------------- 95% credible set of topologies: --------------

Rank = 0; Size = 2; Percent = 40.00; MAP = -39.23243337560144:(C:1.303229359865713,(G:1.184262389767834,(F:9.355326885355721,K:1.9913949707859038)I0:0.8785013943118355)I3:0.005027219859507291)I2; Ave=-39.25394715434493; (C:1.20916762172346,(G:2.057607023787008,(K:4.61838854181666,F:6.605842763421295)I0:0.5911795016873735)I4:0.1657841343330578)I3;

Rank = 1; Size = 1; Percent = 20.00; MAP = -47.01271365298103:((((G:0.6775006643112376)I1#H1:0.10254403746151251::0.6222138281679854,(K:0.40305290671104776,F:11.774054935191478)I3:1.0127444936379693)I4:0.11119651081773953,C:0.6245266587065907)I2:0.24534630112752273,I1#H1:0.6332472675472998::0.3777861718320146)I0; Ave=-47.01271365298103; ((G:0.6775006643112376)I1#H1:0.6332472675472998::0.3777861718320146,(C:0.6245266587065907,((F:11.774054935191478,K:0.40305290671104776)I3:1.0127444936379693,I1#H1:0.10254403746151251::0.6222138281679854)I4:0.11119651081773953)I2:0.24534630112752273)I0;

Rank = 2; Size = 1; Percent = 20.00; MAP = -39.13178296704008:(((G:1.8467353971326255,F:3.669429912707025)I4:0.72972613976057,C:0.7672463997220461)I0:0.14233512944191162,K:7.34631035390478)I3; Ave=-39.13178296704008; (K:7.34631035390478,(C:0.7672463997220461,(F:3.669429912707025,G:1.8467353971326255)I4:0.72972613976057)I0:0.14233512944191162)I3;

Rank = 3; Size = 1; Percent = 20.00; MAP = -40.2382981193348:(((F:7.823288510084807,K:6.50888387662072)I0:0.32861454580215427,C:0.8862007905628368)I4:0.1556168196177095,G:1.218111399301718)I3; Ave=-40.2382981193348; (G:1.218111399301718,(C:0.8862007905628368,(K:6.50888387662072,F:7.823288510084807)I0:0.32861454580215427)I4:0.1556168196177095)I3; 

Total elapsed time : 31.53400 s

Summarization

---------------- Topologies ----------------

PosteriorProbability Topology

0.4500 ((TaB:0.2658858620952823)I2#H1:0.7341141379047177::0.8708100650901477,(TaA:1.0,(TaD:0.32466899225557677,I2#H1:1.3465736635156926::0.1291899349098523)I3:2.4002600314073748)I1:0.19747493612759992)I0;
0.2500 ((TaD:1.0,TaA:1.0):1.0,TaB:1.0);
0.1500 (((TaD:1.57737312713865)I3#H1:0.7217516903113591::0.42596342801648157,TaB:0.3702926554366514)I2:0.5822736350667996,(I3#H1:0.8251033741688848::0.5740365719835184,TaA:0.5942441726893598)I1:0.6915888106669655)I0;
0.1500 ((((TaA:0.2655477426293258)I1#H1:0.11325339702702007::0.4241273916272569,TaD:1.935302259486147)I3:0.786549353980848,TaB:0.40541038525824113)I2:0.2974313193957629,I1#H1:1.0824385015100584::0.5758726083727431)I0;
---------------- PSRF ----------------
Posterior - 1.0498673507956626
Likelihood - 1.09497860895306455
Prior - 1.0576479101618842

---------------- Sojourn ----------------
Topology Frequency Posterior #sojourns max(sjs) ave(sjs)
1 & 4 & 0.4000 & 2 & 3 & 2.00 \\
2 & 4 & 0.4000 & 1 & 4 & 4.00 \\
3 & 1 & 0.1000 & 1 & 1 & 1.00 \\
4 & 1 & 0.1000 & 1 & 1 & 1.00 \\
Topology Frequency Posterior #sojourns max(sjs) ave(sjs)
1 & 8 & 0.8000 & 1 & 8 & 8.00 \\
2 & 2 & 0.2000 & 1 & 2 & 2.00 \\
3 & 0 & 0.0000 & 0 & 0 & NaN \\
4 & 0 & 0.0000 & 0 & 0 & NaN \\
Topology Frequency Posterior #sojourns max(sjs) ave(sjs)
1 & 4 & 0.4000 & 2 & 2 & 2.00 \\
2 & 2 & 0.2000 & 1 & 2 & 2.00 \\
3 & 1 & 0.1000 & 1 & 1 & 1.00 \\
4 & 3 & 0.3000 & 2 & 2 & 1.50 \\
Topology Frequency Posterior #sojourns max(sjs) ave(sjs)
1 & 2 & 0.2000 & 2 & 1 & 1.00 \\
2 & 2 & 0.2000 & 1 & 2 & 2.00 \\
3 & 4 & 0.4000 & 3 & 2 & 1.33 \\
4 & 2 & 0.2000 & 2 & 1 & 1.00 \\

---------------- SRQ Plot ----------------
y0 = [0.25,0.5,0.75,1.0];
x0 = [0.5,0.6,0.7,1.0];
y1 = [0.125,0.25,0.375,0.5,0.625,0.75,0.875,1.0];
x1 = [0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0];
y2 = [0.25,0.5,0.75,1.0];
x2 = [0.3,0.4,0.8,0.9];
y3 = [0.5,1.0];
x3 = [0.5,0.7];
---------------- Trace Plot ----------------

y0 = [-2736.2903,-2487.17354,-2464.78381,-2465.97101,-2452.65531,-2443.28357,-2439.46916,-2441.01877,-2439.93589,-2442.0554,];

y1 = [-2736.2903,-2469.68094,-2438.08212,-2438.52812,-2439.02494,-2438.4918,-2439.33072,-2438.70721,-2438.68125,-2438.68995,];
y2 = [-2736.2903,-2474.94649,-2446.39497,-2438.87233,-2440.56426,-2442.1411,-2438.8736,-2439.18374,-2438.65952,-2439.23713,];
y3 = [-2736.2903,-2466.11338,-2451.75728,-2439.36378,-2439.58731,-2437.75625,-2439.12275,-2438.99142,-2439.20819,-2440.1315,]; 

 

Command References

See Also