COMPARATIVE METHOD ALGORITHM

Reconstructing proto-languages has been one of the goals of historical linguistics since the mid-nineteenth century. The Comparative Method Algorithm (CMA) draws on recent advances in computational linguistics to present a fully automated version of the classic comparative method that linguists use today. The program takes as input lists of words in related languages and a tree representing the relationships among those languages, and it uses this input to find cognate groups and reconstruct proto-words. For each reconstructed phoneme, the algorithm also provides a confidence estimate that reflects the strength of the evidence for that reconstruction. The CMA is simpler, more flexible, and more transparent than previous approaches, and it rivals them in the quality of the reconstructions it produces.
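To make the input and output described above concrete, the following is a minimal toy sketch, not the CMA itself. It assumes the cognates are already grouped and aligned to equal length, and it swaps in a simple weighted majority vote for whatever reconstruction procedure the CMA actually uses. The names `Node` and `reconstruct`, the confidence formula, and the word forms are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A node in the language tree (hypothetical structure).

    Leaves carry an attested word as a list of phonemes; internal
    nodes stand for proto-languages and carry no word.
    """
    name: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[List[str]] = None


def reconstruct(node: Node) -> List[Tuple[str, float]]:
    """Return one (phoneme, confidence) pair per aligned position.

    Assumption: all words in the cognate group are pre-aligned and
    have the same length. This is a majority-vote stand-in, not the
    CMA's own reconstruction method.
    """
    if not node.children:
        # Attested forms are treated as fully certain.
        return [(p, 1.0) for p in node.word]

    child_forms = [reconstruct(child) for child in node.children]
    result = []
    for column in zip(*child_forms):
        # Each child votes for its phoneme, weighted by its confidence.
        votes = Counter()
        for phoneme, conf in column:
            votes[phoneme] += conf
        best, weight = votes.most_common(1)[0]
        # Confidence: share of the children's weight behind the winner.
        result.append((best, weight / len(column)))
    return result


# Invented toy forms, not real language data.
root = Node("proto", children=[
    Node("lang_a", word=["p", "a", "t", "a"]),
    Node("lang_b", word=["p", "a", "t", "a"]),
    Node("lang_c", word=["f", "a", "t", "a"]),
])

proto = reconstruct(root)
for phoneme, conf in proto:
    print(phoneme, round(conf, 2))
```

In this sketch the first position is less certain than the others because one daughter language disagrees, which is the kind of per-phoneme signal the confidence estimate is meant to convey.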
