Lexical Landscapes as large in silico data for examining advanced properties of fitness landscapes

In silico approaches have played a central role in the development of evolutionary theory for generations. This is especially true of the fitness landscape, one of the most important abstractions in evolutionary genetics, and one that has benefited from large empirical data sets only in the last decade or so. In this study, we propose a method for generating enormous data sets that walk the line between in silico and empirical: word usage frequencies as catalogued by the Google ngram corpora. These data can be codified or analogized as a multidimensional empirical fitness landscape for examining advanced concepts, including adaptive landscape by environment interactions, clonal competition, higher-order epistasis and countless others. We argue that the broader Lexical Landscapes approach can serve as a platform offering an astronomical number of fitness landscapes for exploration (at least) or theoretical formalism (potentially) in evolutionary biology.
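
One way to picture how word frequencies might be codified as a fitness landscape is sketched below. It is a minimal illustration under stated assumptions, not the procedure used in this study: each letter position is treated as a locus, each letter as an allele, and a word's usage frequency as its fitness. The function names, the per-position alphabets, and the frequency values are all hypothetical placeholders; none of the numbers are drawn from the ngram corpora.

```python
from itertools import product


def build_landscape(alphabets, freq):
    """Assign a fitness to every string over the per-position alphabets.

    Strings absent from `freq` get fitness 0.0 (assumption: non-words are inviable).
    """
    return {
        "".join(chars): freq.get("".join(chars), 0.0)
        for chars in product(*alphabets)
    }


def neighbors(word, alphabets):
    """Yield every string that differs from `word` by one substitution."""
    for i, alphabet in enumerate(alphabets):
        for ch in alphabet:
            if ch != word[i]:
                yield word[:i] + ch + word[i + 1:]


def local_peaks(landscape, alphabets):
    """Return words whose fitness is positive and no lower than any neighbor's."""
    return sorted(
        w for w, f in landscape.items()
        if f > 0 and all(f >= landscape[n] for n in neighbors(w, alphabets))
    )


# Hypothetical toy input: three loci with two alleles each (8 genotypes).
alphabets = ["bc", "au", "dt"]

# Placeholder frequencies for illustration only, NOT real ngram values.
toy_freq = {
    "bad": 0.30, "bat": 0.10, "bud": 0.05, "but": 0.90,
    "cad": 0.01, "cat": 0.40, "cud": 0.02, "cut": 0.50,
}

landscape = build_landscape(alphabets, toy_freq)
peaks = local_peaks(landscape, alphabets)
print(peaks)
```

With these placeholder values the landscape has more than one local peak, the kind of ruggedness that makes questions about epistasis and accessible mutational paths meaningful. Swapping in frequencies from a different year or corpus would give a second landscape over the same genotypes, which is one way to think about landscape by environment interactions.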
