Network analysis of a corpus of undeciphered Indus civilization inscriptions indicates syntactic organization

Archaeological excavations at sites of the Indus Valley civilization (2500-1900 BCE) in Pakistan and northwestern India have unearthed a large number of artifacts bearing inscriptions composed from a repertoire of hundreds of distinct signs. To date, there is no generally accepted decipherment of these sign sequences, and it has been suggested that the signs could be non-linguistic. Here we apply complex network analysis techniques to a database of available Indus inscriptions, with the aim of detecting patterns indicative of syntactic organization. Our results reveal patterns, such as recursive structures in the segmentation trees of the sequences, that suggest the existence of a grammar underlying these inscriptions.
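As a rough illustration of the kind of representation a network analysis of sign sequences might start from, the sketch below builds a directed network in which a link from sign a to sign b means that b immediately follows a in some inscription. This is an assumption made for illustration, not the authors' documented procedure: the function names, the single-letter sign labels, and the toy inscriptions are all hypothetical, and the order of each list stands in for an assumed reading order.

```python
from collections import Counter


def build_sign_network(sequences):
    """Count directed links between adjacent signs.

    An edge (a, b) with weight w means sign b directly follows
    sign a in w places across the given sequences.
    """
    edges = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            edges[(a, b)] += 1
    return edges


def degrees(edges):
    """Return (in_degree, out_degree) counting distinct neighbours per sign."""
    in_deg, out_deg = Counter(), Counter()
    for a, b in edges:
        out_deg[a] += 1
        in_deg[b] += 1
    return in_deg, out_deg


# Hypothetical toy inscriptions; letters stand in for sign identifiers.
toy_inscriptions = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["E", "B", "C"],
]

edges = build_sign_network(toy_inscriptions)
in_deg, out_deg = degrees(edges)

for (a, b), weight in sorted(edges.items()):
    print(f"{a} -> {b} (weight {weight})")
```

In this toy example the sign B is both preceded and followed by two distinct signs, so it sits at the center of the network. Signs that show such positional regularities across many sequences are the kind of pattern a network analysis could examine further.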
