Malagasy dialects and the peopling of Madagascar

Malagasy DNA is half African and half Indonesian in origin; nevertheless, the Malagasy language, spoken by the entire population, belongs to the Austronesian family. The language most closely related to Malagasy is Maanyan (Greater Barito East group of the Austronesian family), but related languages are also found in Sulawesi, Malaysia and Sumatra. For this reason, and because Maanyan is spoken by a population that lives along the Barito river in Kalimantan and does not possess the skills needed for long-distance maritime navigation, the ethnic composition of the Indonesian colonizers remains unclear. There is a general consensus that Indonesian sailors reached Madagascar by sea, but the time, the route and the landing area of the first colonization are all disputed. In this study, we address these questions, along with others such as the historical configuration of Malagasy dialects, using analyses related to lexicostatistics and glottochronology that draw upon the automated method recently proposed by the authors. The data were collected by the first author at the beginning of 2010 with the invaluable help of Joselinà Soafara Néré and consist of 200-item Swadesh lists for 23 dialects covering all areas of the island.
