Characterization of phylogenetic networks with NetTest

Background

Evolutionary events such as recombination, hybridization, or gene transfer often require phylogenetic networks, rather than trees, to properly depict the evolution of DNA and protein sequences. Although several theoretical classes have been proposed to characterize these networks, they make stringent assumptions that the evolutionary process is unlikely to meet. We have recently shown that the complexity of simulated networks is a function of the population recombination rate, and that at moderate and large recombination rates the resulting networks cannot be categorized. However, we do not know whether these results extend to networks estimated from real data.

Results

We introduce a web server for the categorization of explicit phylogenetic networks, including the most relevant theoretical classes developed so far. Using this tool, we analyzed statistical parsimony phylogenetic networks estimated from ~5,000 DNA alignments obtained from the NCBI PopSet and Polymorphix databases. The level of characterization was correlated with nucleotide diversity, and a high proportion of the networks derived from these data sets could be formally characterized.

Conclusions

We have developed a public web server, NetTest (freely available from the software section at http://darwin.uvigo.es), to formally characterize the complexity of phylogenetic networks. Using NetTest, we found that most statistical parsimony networks estimated with the program TCS could be assigned to a known network class. The level of network characterization was correlated with nucleotide diversity and depended on whether the data were intraspecific or interspecific, although no significant differences were detected among genes. More research on the properties of phylogenetic networks is clearly needed.
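
Since the level of characterization is reported as correlated with nucleotide diversity, it may help to see how that quantity is commonly computed for an alignment. The following is a minimal sketch, not the procedure used in the study: it assumes the standard definition of nucleotide diversity as the average proportion of differing sites over all pairs of sequences, and it assumes that sites with gaps or ambiguity codes are skipped pair by pair. The function name `nucleotide_diversity` and the input format (a list of equal-length strings) are hypothetical choices made for illustration.

```python
from itertools import combinations


def nucleotide_diversity(alignment):
    """Average proportion of pairwise differences per site.

    `alignment` is a list of aligned sequences (equal-length strings).
    Assumption: sites where either sequence has a gap or an ambiguity
    code are ignored for that pair (pairwise deletion).
    """
    if len(alignment) < 2:
        raise ValueError("need at least two sequences")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal length")

    valid = set("ACGT")
    total = 0.0   # sum of per-pair proportions of differing sites
    pairs = 0     # number of pairs with at least one comparable site

    for a, b in combinations(alignment, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in valid and y in valid:
                compared += 1
                if x != y:
                    diffs += 1
        if compared:
            total += diffs / compared
            pairs += 1

    return total / pairs if pairs else 0.0
```

For example, `nucleotide_diversity(["ACGT", "ACGA", "ACGT"])` averages the three pairwise proportions 1/4, 0, and 1/4. Other estimators exist (for instance, weighting by haplotype frequencies or applying a substitution-model correction), so values from this sketch should not be assumed to match those reported for the PopSet and Polymorphix data sets.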
