Network analysis reveals structure indicative of syntax in the corpus of undeciphered Indus civilization inscriptions

Archaeological excavations at sites of the Indus Valley civilization (2500-1900 BCE) in Pakistan and northwestern India have unearthed a large number of artifacts bearing inscriptions that together use hundreds of distinct signs. To date, there is no generally accepted decipherment of these sign sequences, and it has been suggested that the signs may be non-linguistic. Here we apply complex network analysis techniques to the database of available Indus inscriptions, with the aim of detecting patterns indicative of syntactic structure in this sign system. Our results show regularities, e.g., in the segmentation trees of the sequences, that suggest the existence of a grammar underlying the construction of the sequences.
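As a rough illustration of what treating a corpus of sign sequences as a network can look like, the sketch below builds a directed, weighted network in which each sign is a node and each pair of adjacent signs in a sequence contributes an edge. This is only one plausible construction and is not taken from the paper; the function names, the sign labels (`S1`, `S2`, ...) and the toy sequences are all hypothetical.

```python
from collections import Counter, defaultdict


def build_sign_network(sequences):
    """Count directed edges (a -> b) for every pair of adjacent signs.

    Assumption: adjacency within a sequence defines an edge; the paper
    may construct its network differently.
    """
    edges = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            edges[(a, b)] += 1
    return edges


def degrees(edges):
    """Return (in_degree, out_degree) counting distinct neighbours per sign."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for a, b in edges:
        out_deg[a] += 1
        in_deg[b] += 1
    return dict(in_deg), dict(out_deg)


# Hypothetical toy data; these are not real Indus inscriptions.
toy_sequences = [
    ["S1", "S2", "S3"],
    ["S1", "S2", "S4"],
    ["S5", "S2", "S3"],
]

edges = build_sign_network(toy_sequences)
in_deg, out_deg = degrees(edges)

for (a, b), weight in sorted(edges.items()):
    print(f"{a} -> {b}: {weight}")
```

In a network of this kind, signs that follow or precede many different signs stand out through their degree, which is the sort of regularity that could then be compared against randomized sequences.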
