Consistency of Bayesian inference of resolved phylogenetic trees.

Bayesian inference is now a leading technique for reconstructing phylogenetic trees from aligned sequence data. In this short note, we formally show that the maximum a posteriori (MAP) tree topology provides a statistically consistent estimate of a fully resolved evolutionary tree under a wide variety of conditions. This includes the inference of gene trees from aligned sequence data across the entire parameter range of branch lengths, and under general conditions on priors in models where the usual 'identifiability' conditions hold. We extend this to the inference of species trees from sequence data, where the gene trees constitute 'nuisance parameters', as in the program *BEAST. This note also addresses earlier concerns raised in the literature questioning the extent to which statistical consistency for Bayesian methods might hold in general.
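
For readers who want the central claim in symbols, the following is a minimal sketch of what statistical consistency of the maximum posterior topology usually means. The notation is an assumption introduced here for illustration and is not taken from the note itself: $T_0$ is the true fully resolved topology, $\theta_0$ its branch lengths and other model parameters, $D_n$ an alignment of $n$ independently generated sites, and $\widehat{T}_n$ the topology with the highest posterior probability.

```latex
% Posterior over topologies, with continuous parameters integrated out
% against the prior \pi(\theta \mid T):
\[
  P(T \mid D_n) \;\propto\; P(T) \int P(D_n \mid T, \theta)\, \pi(\theta \mid T)\, d\theta ,
  \qquad
  \widehat{T}_n \;=\; \arg\max_{T} \, P(T \mid D_n).
\]
% Consistency: the maximum posterior topology recovers the true one
% with probability tending to 1 as the number of sites grows.
\[
  \lim_{n \to \infty} \Pr\nolimits_{(T_0,\theta_0)}\!\bigl( \widehat{T}_n = T_0 \bigr) \;=\; 1 .
\]
```

In the species-tree setting, the same statement would apply with $T$ read as the species tree and the gene trees integrated out alongside $\theta$, which is the sense in which they act as nuisance parameters.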
