PhyloFinder: An intelligent search engine for phylogenetic tree databases

Background: Bioinformatic tools are needed to store and access the rapidly growing phylogenetic data. These tools should enable users to identify existing phylogenetic trees containing a specified taxon or set of taxa and to compare a specified phylogenetic hypothesis to existing phylogenetic trees.

Results: PhyloFinder is an intelligent search engine for phylogenetic databases that we have implemented using trees from TreeBASE. It enables taxonomic queries, in which it identifies trees in the database containing the exact name of the query taxon and/or any synonymous taxon names, and it provides spelling suggestions for the query when there is no match. Additionally, PhyloFinder can identify trees containing descendants or direct ancestors of the query taxon. PhyloFinder also performs phylogenetic queries, in which it identifies trees that contain the query tree or topologies that are similar to the query tree.

Conclusion: PhyloFinder can enhance the utility of any tree database by providing tools for both taxonomic and phylogenetic queries as well as visualization tools that highlight the query results and provide links to NCBI and TBMap. An implementation of PhyloFinder using trees from TreeBASE is available from the web client application found in the availability and requirements section.
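To make the taxonomic query concrete, the following is a minimal Python sketch of the behaviour described above: look up trees by exact taxon name, extend the lookup to synonymous names, and fall back to spelling suggestions when nothing matches. It is not PhyloFinder's implementation. The tree identifiers, taxon lists, synonym groups, and the function names `build_index`, `expand_synonyms`, and `taxonomic_query` are all hypothetical, and the use of an in-memory inverted index with `difflib` for suggestions is an assumption made for illustration.

```python
from difflib import get_close_matches

# Hypothetical toy data: tree id -> taxon names that label its leaves.
TREES = {
    "T1": {"Homo sapiens", "Pan troglodytes", "Gorilla gorilla"},
    "T2": {"Puma concolor", "Panthera leo"},
    "T3": {"Felis concolor", "Canis lupus"},
}

# Hypothetical synonym groups: every name in a group denotes the same taxon.
SYNONYMS = [{"Puma concolor", "Felis concolor"}]


def build_index(trees):
    """Build an inverted index: lower-cased taxon name -> set of tree ids."""
    index = {}
    for tree_id, taxa in trees.items():
        for name in taxa:
            index.setdefault(name.lower(), set()).add(tree_id)
    return index


def expand_synonyms(name, synonym_groups):
    """Return the query name together with any names listed as its synonyms."""
    names = {name}
    for group in synonym_groups:
        if name.lower() in {member.lower() for member in group}:
            names |= group
    return names


def taxonomic_query(name, index, synonym_groups):
    """Return (sorted tree ids, spelling suggestions).

    Suggestions are lower-cased index keys and are only computed
    when neither the name nor any of its synonyms is found.
    """
    hits = set()
    for candidate in expand_synonyms(name, synonym_groups):
        hits |= index.get(candidate.lower(), set())
    if hits:
        return sorted(hits), []
    # Assumed cutoff of 0.8; a real system would tune this threshold.
    suggestions = get_close_matches(name.lower(), list(index), n=3, cutoff=0.8)
    return [], suggestions


INDEX = build_index(TREES)
```

In this sketch, querying `"Puma concolor"` also returns the tree that uses the synonym `"Felis concolor"`, while a misspelled query such as `"Homo sapeins"` returns no trees and a list of close name matches instead. Queries for descendants or ancestors of a taxon would additionally need a taxonomic hierarchy, which is left out here.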
