
Example with Starting State

#NEXUS 

Begin data;
	Dimensions ntax=3 nchar=500;
	Format datatype=dna symbols="ACTG" missing=? gap=-;
	Matrix
[0, 500]
A TCGCGCTAACGTCGTTTATAAGTGATCAAAGATAAAAGGAAATCTAAGCTGCCTTCATGTTCCTCATCGGACCTGCACAAGGATGGGCGTGGAGATTCTGGCATGGATACTGTACTTTTACGCGATCGCCCCAGCTACCGACCTCTATAATCACAGGGAATCTCGGGGAACGAATTGCTTCACTAGGTCACACCCGGTTTATAGCCCGTAGAAGTTAGAGCCCGCGAATAAAGGACTAACAACTCTTATCAAGCTAAGGGACATCCTAGAGGGACCTCTGCGGGAGCAGCATGTTGTGTGACTCATCACGGTAAGAACTTGGCAAGCGCGACAGCGGCTAAGCCAGCATGCTAGGCGTCGTCGGATAGTCGCCGTCACGGAATCGGATGAGATCCCTTGAGGGATTGATGATGTTCACATCACTACATGGTTGTTCTGAGTGTTGGTGATCAGGTGCAGCAATTGTGCTTGACGGAAATGGGCTCTCATAACCGAACCCA
C GCGCACCTCCCTCGGATATAAGTGACCGAAGAGAAAAGGGAATCTAATTGGCCCTATCATCACTCATCGTACCTGATCACGTATGGCTGTGGAGATTGCGGCATGGATACTGTACTTTTGAGCGATCATCCCAGTTACCGACCTTCTTAATAAGAGGGAACCTAGGGTAAAGGAATGCTCCACTCCGTCACACGGGGTATATATCCGGAATATGTTAGGCCCCCCGAATGAAGGAGTAAAAACTCTTAACAAGCTCCGACAGATCCTAGGGTATCGTCTGCGGGGCCGGCAGGTCGTGGGACGCATCACGCTAAACACTTGGCAAGCGTGACAGCGGCTGGGTCAGAATGCTCGGCCACGCCGTTTAGTCGCCGGCACCGAATCGAATGTGATCCCTTGAGGAAATGATGAAGTTAACATCATTACATGGGTGCTCTGAGTGATGGTGATAAGGTGGAGGACTTGTGTTTGACGGAAATGGGCTCTGAAAACCGAACTCT
B GCGCACCTACTGCGGATATAAGTGACCGAAGAGTAATGGGAATCTATGCGGCCCTCGCGTCTCTCATCGTACCTGATCAAGTATGGGCGTGGAGATTGTGGCATGGATACTGTACTTTTGAGCCATCATCCCAGTTACCGACCTTCGTAATAAGAGCGAGCCTAGGGGAAAGAAATGCTCCACTCCATCACACCGGGTATATATCCGGAATATGTTCGAGCCCCCGAATAAAGGAGTAAAAACTCTTAACAAGCTCCGAAACATCCTAGGGTATCCTCTGCAGGGACGGCATGTTGTGGGGCCCATCACCCTAAGACCTTTGCAAGCATGAAAGCGGCTCAGCCAGCATGCTCGATCCCGCCGTACAGTCGCCGGCACGGAATCGAGTGTGATCCCCTGAGGAATTGATGAAGTTAACATCACTACTTGGCTGCTCTGAGTGCTGGTGATCAGGTGCAGCACATATGTGTGACGGAAATGGGCACTGACAACCGAACTAT
[1, 500]
A GAAACGGATCTAAGTGTACGGTTTCTCTCGAAGGGGGCACCTTTGCTATGCCCACCCCCATCTTGGAAGTGCGAGACCATACTCGCGCGTGCGTCAGGTTCTTACTTGATTTCGGCGGGGGTGGCTAAATTTTAGCTAGGGATCTAGAAATCCGTCATAGTCCTACAGGGCCATTCTGCCGCTTGCTAGCGTTGGTGATACGAGGGCAACTTTGAACTTTACGCGGAACTCCCCACCTCAGAGACTGTTACGACGTAGGCTAAATGTGCCGTGATTTCTGAGGGCAAAAGCCGTGCAAGGATGGACGGGGGTGCTCAAACAACTGCATCAGCCTCGGCATTATCTTGCATGAGCGCCTTCGATCGGTCACCAGTCGGCTAGATTACAAGCAAGCTCTTCGGAGGAGATGAGCTCGCATGGATCACGCGTCTACGTAACTTTCAGGGTCCATCCAAATGTCAATCATTCACCGAATGGCGATCGTCAGGTACGCGATTCCA
C CGCTCGGATCTAAGTGTACGGTTTCTCTCGAAGGTGGAACCATTGCTATACCCACCCCCATCTTGGAAGTGCCAAACCATTCTCCCAAGAGCGTCGGGTTCTTACTCGATTTCGGCGGGGGTGGCTACAATTTAGGTAGGGATCTAGAAATCGGTTATAATCCTACAAAGCCATTCTGGCGCTTGCTAGTGTTGGTGATACGAGGGCAGCTTTGAACTTTACCGGGAACTGGGCACCTAAGGGACTGTGTCGACGTAGGCTAAATGTGCCGTGATTTCAGCGAGCAAAAGCCATGCAAGATTGGACGGGGGGCCTCAAACAACTGCATCAGCCTCGATATTATCTTGCATGAGCTCCTTCGATCGGTTCCCAGTCGGCTATATTATAAGCAAGCTCTTCGGAGGATATGAGCACGCACGGATTCCGCGTCTACGTAACTTTGAGGGCCCAGCCAGCAGTCAATCATTCAACGAATGGCGATCATAACGAACGCGATTCCA
B CGCTCGGATCTAAGTGTACGGTTTCTCTGGAAGGTGGAACCATTGCTATACCCATCCCCATCTTGGAAGTGCCAGACCATTCTCCCAAGAGCGTCTGGTTCTTACTCGATTTCGGCGGGGGTGGCTACAATTTAGGTAGGGATCTAGAAATCGGTGATAATCGTACAAAGCCATTCTGGCGCTTGCTAGTGTCGGTGATACGAGAGCAGCTTTGAACTTTACCCGGAACTGCGCACCTAAGGGACTGTGTCGACGTAGGCTAAATGTGCCGTGATTTCAGCGAGCAAAAGCCATGCAAGATTGGACGGGCGGCCTCAAACAACTGCATCAGCCTCGATATTATCTTGCATGAGCTCCTTCGATCGGTTCCCAGTCGGCTATCTTATAAGCAAGCTCTTCGGAGGATATGAGCACGCACGGATTCCGCGTCTACGTAACTTTGAGGGCCCAGCCAGCAGTCAATCATTGACCGAATGGCGATCATAACGAACGCGATTCCA
;End;

BEGIN TREES;
Tree gt0 = (A:0.119900443,(C:0.058838639,B:0.058838639):0.061061803);
Tree gt1 = (A:0.068766378,(C:0.016229589,B:0.016229589):0.052536789);
END;

BEGIN NETWORKS;
Network net1 = (((B:0.0)I3#H1:0.05::0.8,(C:2.0E-8,I3#H1:2.0E-8::0.2)I2:0.04999998)I1:0.01,A:0.06)I0;
END;

BEGIN PHYLONET;  
MCMC_SEQ -cl 50000 -bl 10000 -sgt (gt0,gt1) -snet net1 -sps 0.04 -pre 20;
END;
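
The starting network net1 uses `#H` labels for its reticulation node and a `::` suffix for inheritance probabilities. Below is a minimal Python sketch, not part of PhyloNet, that extracts those probabilities from the network string and checks that the two parent edges of each reticulation node sum to 1. The regular expression and the function names are illustrative assumptions, and the check assumes both copies of a reticulation node carry an explicit probability, as they do in net1.

```python
import math
import re
from collections import defaultdict

# Network string copied from the NETWORKS block of the example.
NET1 = "(((B:0.0)I3#H1:0.05::0.8,(C:2.0E-8,I3#H1:2.0E-8::0.2)I2:0.04999998)I1:0.01,A:0.06)I0;"

# Matches "<label>#H<k>:<branch length>::<inheritance probability>".
# This pattern is an assumption that covers the notation used in net1 only.
HYBRID_EDGE = re.compile(r"(\w+#H\d+):([0-9.eE+-]+)::([0-9.eE+-]+)")


def inheritance_probabilities(network):
    """Collect the inheritance probabilities listed for each reticulation node."""
    probs = defaultdict(list)
    for label, _length, gamma in HYBRID_EDGE.findall(network):
        probs[label].append(float(gamma))
    return dict(probs)


def probabilities_are_consistent(network):
    """True if every reticulation node has two parent edges summing to 1."""
    probs = inheritance_probabilities(network)
    return all(len(p) == 2 and math.isclose(sum(p), 1.0) for p in probs.values())


gammas = inheritance_probabilities(NET1)
print(gammas)
print(probabilities_are_consistent(NET1))
```

Running a check like this before starting a long chain can catch a mistyped starting network early.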

 

Example with Missing Data

#NEXUS 
 
Begin data;
	Dimensions ntax=5 nchar=108;
	Format datatype=dna symbols="ACTG" missing=? gap=-;
	Matrix
[loci1, 53, ...]
a1	ATTGGAGACRAGCGARGACCGAGCTCACGAACCTGAGGAATGGAATCGATTAC
a2	ATTTGAGACRAGCGARGACCGAGCTCACGAACCTGAGGANTGGAATCGATTAC
b1	TTGGGAGACGAGCGAAGACAGAGCATATGAGCCTAAGGATTGGAATCGATTGT
b2	TTGGGAGACGAGCGAAGACAGAGCATATGAGCCTGAGGATTGGAATCGATTGT
[loci2, 58, ...]
a2	ACTTTGCAAGCCAAAAATGGTATGCGAGACAACGCCTGTCATGGATGATGAACCAGAT
b1	GCTTTGCAAGCCTAAGATGGTTTGCGAGACGACGATGGCAGTCGACGATGAATCAGAC
b2	GCTTTGCAAGCCTAAGATGGTTTGCGAGACGACGATGGCAGTCGACGATGAATCAGAC
c1	GCTTTGRAAGRCAAAAATGATATGCGAAACAACGCCCGTGATGGACGATGAACAGGAT
;End;
BEGIN PHYLONET;  
MCMC_SEQ -loci (loci1,loci2) -cl 5000000 -bl 1000000 -tm <A:a1,a2; B:b1,b2; C:c1>;
END;
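
In this example not every allele has a sequence in every locus. The following Python sketch, which is an illustration and not part of PhyloNet, parses the `-tm` taxon map from the command and reports which mapped alleles are absent from each locus. The helper names are hypothetical, and the per-locus allele lists are transcribed by hand from the matrix.

```python
# Taxon map copied from the -tm argument of the example.
TAXON_MAP = "<A:a1,a2; B:b1,b2; C:c1>"

# Allele names listed under each locus marker in the example matrix.
LOCI = {
    "loci1": ["a1", "a2", "b1", "b2"],
    "loci2": ["a2", "b1", "b2", "c1"],
}


def parse_taxon_map(text):
    """Turn '<A:a1,a2; B:b1,b2>' into {'A': ['a1', 'a2'], 'B': ['b1', 'b2']}."""
    mapping = {}
    for entry in text.strip().strip("<>").split(";"):
        entry = entry.strip()
        if not entry:
            continue
        species, alleles = entry.split(":")
        mapping[species.strip()] = [a.strip() for a in alleles.split(",")]
    return mapping


def missing_alleles(taxon_map, loci):
    """For each locus, list the mapped alleles that have no sequence there."""
    all_alleles = [a for alleles in taxon_map.values() for a in alleles]
    return {name: [a for a in all_alleles if a not in present]
            for name, present in loci.items()}


species_to_alleles = parse_taxon_map(TAXON_MAP)
missing = missing_alleles(species_to_alleles, LOCI)
print(species_to_alleles)
print(missing)
```

For the data above, the sketch reports that c1 is absent from loci1 and a1 is absent from loci2.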



Understanding the Output

System Output

  • Logger: each time a sample is collected, the program prints the iteration number, the posterior value, the current ESS (Effective Sample Size) based on the posterior values, the likelihood value, the prior value, the current ESS based on the prior values, the number of reticulations, and the sampled phylogenetic network. The value in square brackets before the network is the population size.
  • Summarization: the program prints the burn-in length, the chain length, the sample size and the overall acceptance rate of proposals.
  • Operations: how many times each operation was used and its acceptance rate.
  • Topologies: the MAP (Maximum A Posteriori) topology is given. For each unique topology, the program prints the network with the maximum posterior value and the network with averaged branch lengths and inheritance probabilities. The topologies are ranked by their posterior probabilities.
  • Run time: the elapsed time.
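
The Logger rows in the sample output of this section can be read programmatically. Here is a minimal Python sketch, not part of PhyloNet, that parses one sample row and splits the population size from the network line. The field order follows the header row "Iteration; Posterior; ESS; Likelihood; Prior; ESS; #Reticulation"; the class and function names are hypothetical. In the rows shown, the posterior equals the likelihood plus the prior up to rounding, and the last line checks that arithmetic.

```python
from dataclasses import dataclass


@dataclass
class LogSample:
    iteration: int
    posterior: float
    ess_posterior: float
    likelihood: float
    prior: float
    ess_prior: float
    reticulations: int


def parse_sample_line(line):
    """Parse one 'Iteration; Posterior; ESS; Likelihood; Prior; ESS; #Reticulation' row."""
    fields = [f.strip() for f in line.split(";") if f.strip()]
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    return LogSample(int(fields[0]), float(fields[1]), float(fields[2]),
                     float(fields[3]), float(fields[4]), float(fields[5]),
                     int(fields[6]))


def split_population_size(network_line):
    """Split '[popSize]network;' into (popSize, network)."""
    if not network_line.startswith("["):
        raise ValueError("network line does not start with '['")
    close = network_line.index("]")
    return float(network_line[1:close]), network_line[close + 1:]


# Both lines are copied from the sample output in this section.
sample = parse_sample_line("0; -257.55069; 0.00000; -263.27861; 5.72791; 0.00000; 0;")
pop_size, network = split_population_size(
    "[0.036]((((Scer:0.00857375,Spar:0.00857375):4.5125000000000026E-4,"
    "Skud:0.009025):4.7499999999999973E-4,Sbay:0.0095):0.11472678834065418,"
    "Smik:0.12422678834065418);"
)

print(sample)
print(pop_size)
# Posterior vs. likelihood + prior, compared with a tolerance for rounding.
print(abs(sample.posterior - (sample.likelihood + sample.prior)) < 1e-4)
```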

 

MCMC_SEQ -cl 250000 -bl 50000 -sf 5000

----------------------- Logger: -----------------------
Iteration; Posterior; ESS; Likelihood; Prior; ESS; #Reticulation
0; -257.55069; 0.00000; -263.27861; 5.72791; 0.00000; 0;
[0.036]((((Scer:0.00857375,Spar:0.00857375):4.5125000000000026E-4,Skud:0.009025):4.7499999999999973E-4,Sbay:0.0095):0.11472678834065418,Smik:0.12422678834065418);
......
50; -176.50732; 10.65831; -181.15553; 4.64822; 11.84238; 0;
[0.017160158027924775](((Sbay:0.01968119866828454,Skud:0.01968119866828454):0.042016035724419504,(Spar:0.04364900393745317,Scer:0.04364900393745317):0.018048230455250877):0.01662669337541752,Smik:0.07832392776812157);
----------------------- Summarization: -----------------------
Burn-in = 50000, Chain length = 250000, Sample size = 40, Acceptance rate = 0.10274
--------------- Operations ---------------
Operation:NarrowNNI; Used:34781; Accepted:3750 ACrate:0.10781748655875334
Operation:Swap-Nodes; Used:5273; Accepted:155 ACrate:0.02939503129148492
Operation:SubtreeSlide; Used:34904; Accepted:3566 ACrate:0.10216594086637634
......

Overall MAP = -139.6655535361708

(((Spar:0.054401097303896875,Scer:0.054401097303896875):0.02940261095452569,Smik:0.08380370825842257):0.015640290731517764,(Sbay:0.038489186677349164,Skud:0.038489186677349164):0.060954812312591165);
-------------- Top Topologies: --------------
Rank = 0; Size = 20; Percent = 048.4878047804; MAP = -139.6655535361708:(((Spar:0.054401097303896875,Scer:0.054401097303896875):0.02940261095452569,Smik:0.08380370825842257):0.015640290731517764,(Sbay:0.038489186677349164,Skud:0.038489186677349164):0.060954812312591165); Ave=-159.81967227297005; ((Smik:0.07802648460532205,(Scer:0.04734369459293139,Spar:0.04734369459293139):0.03068279001239066):0.012243968912293374,(Skud:0.0399365140411103,Sbay:0.0399365140411103):0.05033393947650512);
Rank = 1; Size = 16; Percent = 039.3902430243; MAP = -150.1407504811838:(Smik:0.08832671142241318,((Sbay:0.0574884656789708,Skud:0.0574884656789708):0.029947862652692656,(Spar:0.05271204611595535,Scer:0.05271204611595535):0.034724282215708106):8.903830907497218E-4); Ave=-171.0422748749801; (Smik:0.09785346027572299,((Sbay:0.040857883347008524,Skud:0.040857883347008524):0.03552943228704832,(Scer:0.055009891356695276,Spar:0.055009891356695276):0.02137742427736157):0.021466144641666143);
......

Total elapsed time : 27.35100 s
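
The figures in the Summarization and Operations sections above can be cross-checked with simple arithmetic. The Python sketch below is an illustration, not PhyloNet code. It assumes the sample size is (chain length - burn-in) / sample frequency, which is consistent with the reported "Sample size = 40" for `-cl 250000 -bl 50000 -sf 5000`, and that ACrate is Accepted / Used. The function names and the regular expression are hypothetical.

```python
import math
import re

# Pattern for lines such as
# "Operation:NarrowNNI; Used:34781; Accepted:3750 ACrate:0.10781748655875334"
OPERATION_LINE = re.compile(
    r"Operation:([^;]+); Used:(\d+); Accepted:(\d+) ACrate:([0-9.eE+-]+)"
)


def expected_sample_size(chain_length, burn_in, sample_frequency):
    """Number of samples kept after burn-in (assumed formula, see lead-in)."""
    return (chain_length - burn_in) // sample_frequency


def parse_operation(line):
    """Return (name, used, accepted, reported_rate) for one Operations line."""
    match = OPERATION_LINE.search(line)
    if match is None:
        raise ValueError(f"not an operation line: {line!r}")
    name, used, accepted, rate = match.groups()
    return name, int(used), int(accepted), float(rate)


n_samples = expected_sample_size(250000, 50000, 5000)
name, used, accepted, reported = parse_operation(
    "Operation:NarrowNNI; Used:34781; Accepted:3750 ACrate:0.10781748655875334"
)

print(n_samples)
# Compare the reported rate with accepted / used.
print(name, math.isclose(accepted / used, reported, rel_tol=1e-9))
```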

Sample Files

The sampled phylogenetic networks, gene trees and the hyper-parameter of the population size are logged to files in your home directory, or in the directory specified by "-dir outDirectory".

  • Phylogenetic Network: ~/outDirectory/network.log
  • Hyper-parameter of Population size: ~/outDirectory/popSizePrior.log
  • Gene tree: ~/outDirectory/tree_locusName.log

Downloads

  • example.zip
    • example.nexus: input file for PhyloNet
    • example.txt: system output
    • network.log, popSizePrior.log, tree_YAL053W.log, tree_YAR007C.log, tree_YBL015W.log: sample files
  • The yeast data set (Rokas et al., 2003), sampled from seven Saccharomyces species: S. cerevisiae (Scer), S. paradoxus (Spar), S. mikatae (Smik), S. kudriavzevii (Skud), S. bayanus (Sbay), S. castellii (Scas) and S. kluyveri (Sklu)
    • 106-locus
    • 28-locus (with strong phylogenetic signals)
    • 106-locus restricted to five Saccharomyces species: Scer, Spar, Smik, Skud and Sbay
  • The wheat data set (Marcussen et al., 2014), sampled from the three subgenomes of hexaploid bread wheat T. aestivum, TaA (A subgenome), TaB (B subgenome) and TaD (D subgenome), and from five diploid relatives: T. monococcum (Tm), T. urartu (Tu), Ae. sharonensis (Ash), Ae. speltoides (Asp) and Ae. tauschii (At)
  • The mosquito data set (Fontaine et al., 2014), sampled from six Anopheles species: An. gambiae (G), An. coluzzii (C), An. arabiensis (A), An. quadriannulatus (Q), An. merus (R) and An. melas (L)
    • 228-locus from X chromosome
    • 59-locus (with strong phylogenetic signals) from X chromosome
    • 382-locus (with strong phylogenetic signals) from autosomes

Command References

  1. D. Wen and L. Nakhleh. Co-estimating reticulate phylogenies and gene trees on sequences from multiple independent loci. Submitted.
  2. I. Gronau et al. Bayesian inference of ancient human demography from individual genome sequences. Nature Genetics 43.10 (2011): 1031-1034.

See Also