LSHPlace: Fast Phylogenetic Placement Using Locality-Sensitive Hashing

We consider the problem of phylogenetic placement, in which large numbers of sequences (often next-generation sequencing reads) are placed onto an existing phylogenetic tree. We adapt our recent work on phylogenetic tree inference, which uses ancestral sequence reconstruction and locality-sensitive hashing, to this domain. With these ideas, new sequences can be placed onto trees with high fidelity and very short runtimes. Our method is two orders of magnitude faster than existing programs for this domain, at the cost of a modest loss in accuracy. This speed offers the possibility of analyzing many more reads in a next-generation sequencing project than is currently possible.
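To make the role of locality-sensitive hashing concrete, the following is a minimal sketch of one generic LSH scheme, bit sampling for Hamming distance over aligned sequences. It is an illustration under stated assumptions, not the LSHPlace implementation: the class name `BitSamplingLSH`, its parameters (`num_tables`, `key_len`, `seed`), and the toy node sequences are all hypothetical, and the abstract does not specify which hash family or parameters the authors use. The idea it shows is that sequences stored at tree nodes (for example, reconstructed ancestral sequences) are indexed once, so that a query read is compared only against the nodes it collides with rather than against every node.

```python
import random
from collections import defaultdict


def hamming(a, b):
    """Number of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


class BitSamplingLSH:
    """Hypothetical bit-sampling LSH index over equal-length aligned sequences."""

    def __init__(self, seq_len, num_tables=10, key_len=8, seed=0):
        rng = random.Random(seed)
        # Each table hashes a sequence by the characters at key_len random sites.
        self.positions = [
            sorted(rng.sample(range(seq_len), key_len)) for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.seqs = {}

    def _key(self, seq, t):
        return "".join(seq[i] for i in self.positions[t])

    def add(self, name, seq):
        """Index a node sequence under its key in every table."""
        self.seqs[name] = seq
        for t, table in enumerate(self.tables):
            table[self._key(seq, t)].append(name)

    def query(self, seq):
        """Return the colliding node closest in Hamming distance, or None."""
        candidates = set()
        for t, table in enumerate(self.tables):
            candidates.update(table.get(self._key(seq, t), ()))
        if not candidates:
            return None
        # Ties are broken by name so the result is deterministic.
        return min(candidates, key=lambda n: (hamming(seq, self.seqs[n]), n))


# Toy usage: three made-up node sequences of length 16.
index = BitSamplingLSH(seq_len=16, num_tables=10, key_len=8, seed=1)
index.add("node_A", "ACGTACGTACGTACGT")
index.add("node_B", "TTTTACGTACGTCCCC")
index.add("node_C", "GGGGGGGGACGTACGT")

# An exact copy of node_B's sequence collides with node_B in every table.
print(index.query("TTTTACGTACGTCCCC"))
```

Similar sequences agree at most sampled sites and so tend to share a key in at least one table, while dissimilar ones rarely do. Raising `key_len` makes collisions more selective, and raising `num_tables` recovers sensitivity at the cost of memory; a real placement tool would also need to handle gaps, reads that cover only part of the alignment, and the choice of edge once a nearby node is found.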
