Translating DNA data tables into quasi-median networks for parsimony analysis and error detection.

Every DNA data table can be turned into a quasi-median network that faithfully represents the data. We show that for (weighted) condensed data tables the associated network harbors all most parsimonious reconstructions for any tree that connects the sampled haplotypes. Structural features of this network can be computed directly from the data table. The key principle repeatedly used is that the quasi-median network is uniquely determined by the sub-tables for pairs of characters. The translation of a table into a network enhances the understanding of the properties of the data in regard to homoplasy and potential artifacts. The total number of nodes of such a network measures the complexity of the data. In particular, networks that display the results of filter analyses by which hotspot mutations are removed help to detect data idiosyncrasies and thus pinpoint sequencing problems. A pertinent example drawn from human mtDNA illustrates these points.
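The translation step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard quasi-median operation on aligned sequences: at each position, take the state shared by at least two of the three sequences, and take the state of the first sequence when all three differ. The function names (`quasi_median`, `quasi_median_closure`, `network_edges`, `pair_subtable`) and the toy table are hypothetical. The brute-force closure is only practical for very small tables.

```python
from itertools import combinations, permutations


def quasi_median(u, v, w):
    """Position-wise quasi-median: the majority state if two of the three
    sequences agree, otherwise the state of the first sequence u."""
    return "".join(b if b == c else a for a, b, c in zip(u, v, w))


def quasi_median_closure(haplotypes):
    """Node set of the network: the smallest set containing the sampled
    haplotypes that is closed under the quasi-median operation."""
    nodes = set(haplotypes)
    changed = True
    while changed:
        changed = False
        # sorted() copies the set, so adding to `nodes` inside the loop is safe.
        for u, v, w in permutations(sorted(nodes), 3):
            q = quasi_median(u, v, w)
            if q not in nodes:
                nodes.add(q)
                changed = True
                break  # restart the scan with the enlarged node set
    return nodes


def network_edges(nodes):
    """Link two nodes when they differ at exactly one position."""
    return [
        (x, y)
        for x, y in combinations(sorted(nodes), 2)
        if sum(a != b for a, b in zip(x, y)) == 1
    ]


def pair_subtable(haplotypes, i, j):
    """Sub-table for characters i and j: the set of observed state pairs."""
    return {(h[i], h[j]) for h in haplotypes}


# Toy condensed table: four haplotypes scored at three binary characters.
table = ["000", "110", "101", "011"]
nodes = quasi_median_closure(table)
edges = network_edges(nodes)
```

In this toy table every pair of characters shows all four state combinations, so each pair is in conflict and the closure fills in the whole cube. The resulting node count is the kind of complexity measure the abstract refers to.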
