Vertical and horizontal transmission in language evolution

It has been observed that borrowing within a group of genetically related languages often causes the lexical similarities among them to be skewed. Consequently, it has been proposed that borrowing can sometimes be inferred from such skewing. However, heterogeneity in the rate of lexical replacement, as well as borrowing from languages outside the group, can also give rise to skewed lexical similarities. It is therefore important to determine to what degree skewing is a statistically significant indicator of borrowing. Here, we describe a statistical hypothesis test for detecting language contact based on skewing of linguistic characters of arbitrary type. The test maintains significant probabilities of correct detection across various contact scenarios while keeping the false-alarm probability low. Our experiments show that the test is fairly robust to substantial heterogeneity in the retention rate, both across characters and across lineages. This suggests that the method can provide an objective criterion against which claims of significant skewing due to contact can be tested, pointing the way to more detailed analysis.
