Genetic and linguistic histories in Central Asia inferred using approximate Bayesian computations

Linguistic and genetic data have been widely compared, but the histories underlying them are rarely inferred jointly. We developed a methodological framework for jointly analysing language diversity and genetic polymorphism data in order to infer the history of separation, exchange and admixture events among human populations. This method relies on approximate Bayesian computation, which makes it possible to identify the most probable historical scenario underlying each type of data and to infer the parameters of these scenarios. For this purpose, we developed a new computer program, PopLingSim, that simulates the evolution of linguistic diversity, and coupled it with an existing coalescent-based genetic simulation program in order to simulate both linguistic and genetic data within a set of populations. Applying this new program to a large linguistic and genetic dataset from Central Asia, we found several differences between linguistic and genetic histories. In particular, we showed how genetic and linguistic exchanges differed in the past in this area: some cultural exchanges were maintained without genetic exchanges. The methodological framework and the linguistic simulation tool developed here can be used in future work to disentangle the complex linguistic and genetic evolution underlying human biological and cultural histories.
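
As an illustration of the general principle of approximate Bayesian computation described above, the following minimal sketch applies a rejection algorithm to a toy problem. It is not PopLingSim and does not reproduce the analysis of this study: the simulator `simulate_distance`, the two scenario names, the borrowing probability, the priors and the tolerance are all hypothetical choices made for the example, and a single summary statistic stands in for the several that a real analysis would use.

```python
import random

BORROW_PROB = 0.05  # hypothetical per-step probability of re-sharing an item


def simulate_distance(scenario, rate, n_items=100, n_steps=20, rng=random):
    """Toy simulator (an assumption, not PopLingSim).

    Two populations start with identical items. At each step an item that is
    still shared diverges with probability `rate`. Under the "exchange"
    scenario, a diverged item becomes shared again with probability
    BORROW_PROB. Returns the fraction of items that differ at the end.
    """
    differs = [False] * n_items
    for _ in range(n_steps):
        for i in range(n_items):
            if not differs[i] and rng.random() < rate:
                differs[i] = True
            elif differs[i] and scenario == "exchange" and rng.random() < BORROW_PROB:
                differs[i] = False
    return sum(differs) / n_items


def abc_rejection(observed, n_sims=2000, tolerance=0.05, seed=0):
    """Rejection ABC over two scenarios and one rate parameter.

    Returns (scenario probabilities, mean accepted rate per scenario).
    """
    rng = random.Random(seed)
    accepted = {"isolation": [], "exchange": []}
    for _ in range(n_sims):
        scenario = rng.choice(["isolation", "exchange"])  # uniform prior on scenarios
        rate = rng.uniform(0.0, 0.1)                      # uniform prior on the rate
        simulated = simulate_distance(scenario, rate, rng=rng)
        # Keep the draw only if the simulated summary statistic is close
        # enough to the observed one.
        if abs(simulated - observed) <= tolerance:
            accepted[scenario].append(rate)
    total = sum(len(rates) for rates in accepted.values())
    probs = {s: (len(r) / total if total else float("nan")) for s, r in accepted.items()}
    means = {s: (sum(r) / len(r) if r else None) for s, r in accepted.items()}
    return probs, means


if __name__ == "__main__":
    probs, means = abc_rejection(observed=0.5)
    print("Approximate scenario probabilities:", probs)
    print("Mean accepted rate per scenario:", means)
```

The share of accepted simulations coming from each scenario approximates its posterior probability, and the accepted parameter values approximate the posterior distribution of the parameter under that scenario. A smaller tolerance gives a closer approximation at the cost of fewer accepted simulations.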
