Computational reconstruction of ancestral DNA sequences.

This chapter introduces the problem of ancestral sequence reconstruction: given a set of extant orthologous genomic DNA sequences (or even whole genomes), together with a phylogenetic tree relating these sequences, predict the DNA sequence of every ancestral species in the tree. Blanchette et al. (1) have shown that, for certain sets of species (in particular, eutherian mammals), very accurate reconstructions can be obtained. We explain the main steps involved in this process: multiple sequence alignment, insertion and deletion inference, substitution inference, and gene arrangement inference. We also describe a simulation-based procedure for assessing the accuracy of the reconstructed sequences. The whole reconstruction process is illustrated using a set of mammalian sequences from the CFTR region.
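To make the problem statement concrete, the substitution-inference step can be sketched on a toy input. The sketch below uses Fitch parsimony, which is a simple stand-in chosen for brevity and not necessarily the method used in the chapter. It assumes the sequences are already aligned, gap-free, and of equal length, and that the tree is binary. The names `fitch_sets`, `assign`, and `reconstruct`, and the nested-tuple tree encoding, are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple, Union

# Assumption: a tree is a leaf name (str) or a pair (left_subtree, right_subtree).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_sets(node: Tree, column: Dict[str, str], sets: Dict[Tree, Set[str]]) -> Set[str]:
    """Bottom-up pass: compute the candidate base set of every node for one column."""
    if isinstance(node, str):
        sets[node] = {column[node]}
    else:
        left = fitch_sets(node[0], column, sets)
        right = fitch_sets(node[1], column, sets)
        # Intersection if non-empty, otherwise union (one substitution implied).
        sets[node] = (left & right) or (left | right)
    return sets[node]


def assign(node: Tree, parent_base: Optional[str],
           sets: Dict[Tree, Set[str]], out: Dict[Tree, str]) -> None:
    """Top-down pass: pick one base per internal node, preferring the parent's base."""
    if isinstance(node, str):
        return
    candidates = sets[node]
    # Ties are broken alphabetically, an arbitrary choice made for determinism.
    base = parent_base if parent_base in candidates else min(candidates)
    out[node] = base
    for child in node:
        assign(child, base, sets, out)


def reconstruct(tree: Tree, seqs: Dict[str, str]) -> Dict[Tree, str]:
    """Return a predicted sequence for every internal (ancestral) node of the tree."""
    length = len(next(iter(seqs.values())))
    ancestors: Dict[Tree, list] = {}
    for i in range(length):
        column = {name: seq[i] for name, seq in seqs.items()}
        sets: Dict[Tree, Set[str]] = {}
        fitch_sets(tree, column, sets)
        out: Dict[Tree, str] = {}
        assign(tree, None, sets, out)
        for node, base in out.items():
            ancestors.setdefault(node, []).append(base)
    return {node: "".join(bases) for node, bases in ancestors.items()}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    tree = (("human", "chimp"), ("mouse", "rat"))
    seqs = {"human": "ACGT", "chimp": "ACGA", "mouse": "ATGA", "rat": "ATGA"}
    for node, seq in reconstruct(tree, seqs).items():
        print(node, seq)
```

Each internal node is identified by its subtree tuple, so the root's prediction is looked up with the tree itself as the key. A realistic pipeline would also need to infer insertions and deletions before this step, which the sketch deliberately leaves out.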

[1]  D. Ord, et al.  PAUP: Phylogenetic analysis using parsimony, 1993.

[2]  D. Haussler, et al.  Evolution's cauldron: Duplication, deletion, and rearrangement in the mouse and human genomes, 2003, Proceedings of the National Academy of Sciences of the United States of America.

[3]  John P. Huelsenbeck, et al.  MRBAYES: Bayesian inference of phylogenetic trees, 2001, Bioinform.

[4]  D. Haussler, et al.  Counterexample to a claim about the reconstruction of ancestral character states, 2005, Systematic Biology.

[5]  Colin N. Dewey, et al.  Initial sequencing and comparative analysis of the mouse genome, 2002.

[6]  Folker Meyer, et al.  Generating Benchmarks for Multiple Sequence Alignments and Phylogenic Reconstructions, 1997, ISMB.

[7]  G. Moore, et al.  Molecular Evolution in the Descent of Man, 1971, Nature.

[8]  Lior Pachter, et al.  MAVID: constrained ancestral alignment of multiple sequences, 2003, Genome research.

[9]  M. P. Cummings.  PHYLIP (Phylogeny Inference Package), 2004.

[10]  S. Batzoglou, et al.  Distribution and intensity of constraint in mammalian genomic sequence, 2005, Genome research.

[11]  P. Pevzner, et al.  Genome-scale evolution: reconstructing gene orders in the ancestral species, 2002, Genome research.

[12]  J. Jurka.  Repbase update: a database and an electronic journal of repetitive elements, 2000, Trends in genetics : TIG.

[13]  S. O’Brien, et al.  Placental mammal diversification and the Cretaceous–Tertiary boundary, 2003, Proceedings of the National Academy of Sciences of the United States of America.

[14]  S. Batzoglou, et al.  Quantitative estimates of sequence divergence for comparative analyses of mammalian genomes, 2003, Genome research.

[15]  D. Haussler, et al.  Reconstructing large regions of an ancestral mammalian genome in silico, 2004, Genome research.

[16]  D. Haussler, et al.  Aligning multiple genomic sequences with the threaded blockset aligner, 2004, Genome research.

[17]  D. Maddison, et al.  The Tree of Life Web Project, 2007.

[18]  D. Haussler, et al.  Identification and Characterization of Multi-Species Conserved Sequences, 2022.

[19]  D. Haussler, et al.  Human-mouse alignments with BLASTZ, 2003, Genome research.

[20]  D. Haussler, et al.  Ultraconserved Elements in the Human Genome, 2004, Science.

[21]  David Haussler, et al.  Combining Phylogenetic and Hidden Markov Models in Biosequence Analysis, 2004, J. Comput. Biol.

[22]  L. L. Cam, et al.  Asymptotic Methods In Statistical Decision Theory, 1986.

[23]  A. Monaco, et al.  Molecular evolution of FOXP2, a gene involved in speech and language, 2002, Nature.

[24]  Mathieu Blanchette, et al.  On the Inference of Parsimonious Indel Evolutionary Scenarios, 2006, J. Bioinform. Comput. Biol.

[25]  Nancy F. Hansen, et al.  Comparative analyses of multi-species sequences from targeted genomic regions, 2003, Nature.

[26]  S. O’Brien, et al.  Molecular dating and biogeography of the early placental mammal radiation, 2001, The Journal of heredity.

[27]  Jotun Hein, et al.  A Large Version of the Small Parsimony Problem, 2003, WABI.

[28]  Lisa M. D'Souza, et al.  Genome sequence of the Brown Norway rat yields insights into mammalian evolution, 2004, Nature.

[29]  Terrence S. Furey, et al.  The UCSC Genome Browser Database, 2003, Nucleic Acids Res.

[30]  H. Kishino, et al.  Dating of the human-ape splitting by a molecular clock of mitochondrial DNA, 2005, Journal of Molecular Evolution.

[31]  M. Nei, et al.  A new method of inference of ancestral nucleotide and amino acid sequences, 1995, Genetics.

[32]  J. V. Moran, et al.  Initial sequencing and analysis of the human genome, 2001, Nature.

[33]  W. Hoeffding.  Probability Inequalities for sums of Bounded Random Variables, 1963.