Mixture Tree Construction and Its Applications

Chen and Lindsay (Biometrika 93(4):843–860, 2006) developed a new method for building a gene tree from Single Nucleotide Polymorphism (SNP) data. Called the mixture tree, the method is based on an ancestral mixture model. The sieve parameter in the model plays the role of time in the evolutionary tree of the sequences. By varying the sieve parameter, one can create a hierarchical tree that estimates the population structure at each fixed backward point in time. In this chapter, we review the model and then present an application to the clustering of mitochondrial sequences to show that the approach performs well. We then introduce a simulator that simulates real SNP sequences with unknown ancestral history. Using the simulator, we compare the mixture trees with the true trees to evaluate how well the mixture tree method performs. We also compare it with some existing methods, including the neighbor-joining and maximum parsimony methods.
