On the Need for Mechanistic Models in Computational Genomics and Metagenomics

Computational genomics now generates very large volumes of data that could be used to address important questions in both basic biology and biomedicine. Addressing these questions becomes possible when mechanistic models rooted in biochemistry and in evolutionary and population genetic processes are developed, rather than fitting data to off-the-shelf statistical distributions that do not enable mechanistic inference. Three examples are presented: the first involves ecological processes inferred from metagenomic data; the second involves mechanisms of gene regulation rooted in protein–DNA interactions, with consideration of DNA structure; and the third involves existing models for the retention of duplicate genes that enable prediction of evolutionary mechanisms. This description of mechanistic models is then generalized toward future developments in computational genomics and the need for biological mechanisms and processes in biological models.
