SoTree: An Automated Phylogeny Assembly Tool for Ecologists from Big Tree

Entire areas of study, such as community phylogenetics and comparative analysis, require detailed phylogenetic as well as ecological information. China has 31,362 species of vascular plants belonging to 3,328 genera and 312 families, which ranks it among the top six megadiverse countries of the world. With the big tree of vascular plants in China recently completed, SoTree, an automated phylogeny assembly tool, is being implemented for ecologists. This paper presents SoTree, which lets ecologists generate a phylogeny from the big tree of vascular plants in China by searching for species in a high-quality genus-level reference phylogeny on the basis of taxonomy. The algorithm and solution behind SoTree are also described. They consist of several steps: parsing, reconstructing and storing the base big phylogenetic tree; standardizing the species sub-list; retrieving each node element in the chain; constructing the group relations among node elements; calculating the weight of each element; assembling the output phylogenetic tree; and visualizing this tree in an interactive web environment. SoTree is an online phylogenetic query tool: users submit a list of taxa (e.g. from an ecological community) with genus and species names, receive a phylogenetic hypothesis for the relationships among those taxa, and can then visualize the result as a phylogenetic tree, a time tree, or both. Several input methods and several output tree formats are supported, and every name is marked as an accepted name or a synonym according to Flora of China (FOC). Ecologists can therefore obtain accurate tree data while saving considerable time and effort. Currently, the source databases cover vascular plants. SoTree is available at http://www.darwintree.cn/flora/index.shtml.
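
The assembly step described above can be illustrated with a minimal sketch. It is not SoTree's implementation: the abstract does not specify the node-weighting or grouping rules, so this example substitutes plain recursive pruning of a genus-level reference tree, attaches congeneric species as a polytomy under their genus, and collapses internal nodes left with a single child. The names `Node`, `prune`, `to_newick` and `assemble`, and the toy reference topology, are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A tree node; a node without children is a tip."""
    name: str = ""
    children: List["Node"] = field(default_factory=list)


def prune(node: Node, species_by_genus: Dict[str, List[str]]) -> Optional[Node]:
    """Return a pruned copy of the subtree, or None if no requested genus is in it."""
    if not node.children:  # tips of the reference tree are genera
        species = species_by_genus.get(node.name)
        if not species:
            return None
        if len(species) == 1:
            return Node(species[0])
        # Several species of one genus: attach them as a polytomy under the genus.
        return Node(node.name, [Node(s) for s in sorted(species)])
    kept = []
    for child in node.children:
        pruned = prune(child, species_by_genus)
        if pruned is not None:
            kept.append(pruned)
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]  # collapse an internal node left with one child
    return Node(node.name, kept)


def to_newick(node: Node) -> str:
    """Serialize a tree to Newick without branch lengths or trailing semicolon."""
    if not node.children:
        return node.name
    return "(" + ",".join(to_newick(c) for c in node.children) + ")" + node.name


def assemble(reference: Node, species_list: List[str]) -> str:
    """Group binomials by genus, prune the reference tree, return a Newick string."""
    species_by_genus: Dict[str, List[str]] = {}
    for binomial in species_list:
        genus = binomial.split()[0]
        species_by_genus.setdefault(genus, []).append(binomial.replace(" ", "_"))
    pruned = prune(reference, species_by_genus)
    return (to_newick(pruned) if pruned is not None else "") + ";"


# Toy genus-level reference tree (hypothetical topology, for illustration only).
reference = Node("root", [
    Node("Pinaceae", [Node("Pinus"), Node("Picea")]),
    Node("n1", [
        Node("Quercus"),
        Node("Rosaceae", [Node("Rosa"), Node("Malus")]),
    ]),
])

species_list = ["Pinus tabuliformis", "Rosa chinensis", "Rosa rugosa", "Quercus mongolica"]
newick = assemble(reference, species_list)
print(newick)
```

In this sketch the submitted names are assumed to be already standardized; a real pipeline would first resolve synonyms to accepted names (for SoTree, against Flora of China) before grouping by genus.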
