Evolution and Functional Information

“Functional Information” (FI), estimated from the mutual information of protein sequence alignments, has been proposed as a reliable way of estimating the number of proteins with a specified function and the consequent difficulty of evolving a new function. The extreme rarity of functional proteins computed by this approach emboldens some to argue that evolution is impossible: random searches, it seems, would have no hope of finding new functions. Here, we use simulations to demonstrate that sequence alignments are a poor estimate of functional information. The mutual information of sequence alignments drastically underestimates the true number of functional proteins, because it is also strongly influenced by a family’s history, mutational bias, and selection. Moreover, even if functional information could be reliably calculated, it would tell us nothing about the difficulty of evolving new functions, because it does not estimate the distance between a new function and existing functions. The pervasive observation of multifunctional proteins suggests that functions are actually abundant and very close to one another. Multifunctional proteins would be impossible if the FI argument against evolution were true.
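The effect of family history on an alignment-based estimate can be illustrated with a toy simulation. The sketch below is not the simulation used in the paper; it is a minimal illustration under stated assumptions: a hypothetical function that tolerates any of 10 residues at each of 50 sites (so the true FI is exactly 50 bits), a star phylogeny from a single ancestor, and a column-entropy estimate of the form sum over columns of log2(20) minus the column entropy. All function names and parameters are illustrative.

```python
import math
import random
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def column_entropy(column):
    """Shannon entropy (bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def alignment_fi(alignment):
    """Column-entropy FI estimate (bits): sum of log2(20) - H(column).

    Assumption: this stands in for alignment-based estimators in general;
    it is not a specific published implementation.
    """
    h_max = math.log2(len(ALPHABET))
    return sum(h_max - column_entropy(col) for col in zip(*alignment))


def simulate_family(length, allowed, n_seqs, n_mutations, rng):
    """Toy star phylogeny: every sequence descends independently from one
    ancestor and accumulates n_mutations substitutions. Selection is modelled
    by only ever accepting residues from `allowed`."""
    ancestor = [rng.choice(allowed) for _ in range(length)]
    family = []
    for _ in range(n_seqs):
        seq = ancestor[:]
        for _ in range(n_mutations):
            seq[rng.randrange(length)] = rng.choice(allowed)
        family.append("".join(seq))
    return family


rng = random.Random(0)
length = 50
allowed = ALPHABET[:10]  # hypothetical: half of all residues work at every site

# True FI = -log2(fraction of functional sequences) = 50 * log2(20 / 10) bits.
true_fi = length * math.log2(len(ALPHABET) / len(allowed))

young = simulate_family(length, allowed, n_seqs=200, n_mutations=5, rng=rng)
old = simulate_family(length, allowed, n_seqs=200, n_mutations=500, rng=rng)

fi_young = alignment_fi(young)
fi_old = alignment_fi(old)

print(f"true FI:                {true_fi:.1f} bits")
print(f"estimate, young family: {fi_young:.1f} bits")
print(f"estimate, old family:   {fi_old:.1f} bits")
```

In this toy setting the set of functional sequences is identical for both families; only the time since divergence differs. The young family has not yet explored the tolerated residues, so its columns look conserved and the estimate lands well above the true value. Since every extra bit of estimated FI halves the implied number of functional sequences, even a modest overestimate of FI corresponds to a large underestimate of how many functional proteins exist.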
