The Austronesian Basic Vocabulary Database: From Bioinformatics to Lexomics

Phylogenetic methods have revolutionised evolutionary biology and have recently been applied to studies of linguistic and cultural evolution. However, the basic comparative data on the languages of the world required for these analyses are often widely dispersed in hard-to-obtain sources. Here we outline how our Austronesian Basic Vocabulary Database (ABVD) helps remedy this situation by collating wordlists from over 500 languages into one web-accessible database. We describe the technology underlying the ABVD and discuss the benefits that an evolutionary bioinformatic approach can provide. These include facilitating computational comparative linguistic research, answering questions about human prehistory, enabling syntheses with genetic data, and safeguarding fragile linguistic information.
