Can chunking reduce syntactic complexity of natural languages?

Natural language is a complex adaptive system with multiple levels, and its hierarchical structure may have much to do with its complexity. Dependency distance has been invoked to explain various linguistic patterns regarding syntactic complexity. However, little attention has been paid to how the structural properties of language help to minimize dependency distance. This article computationally simulates several chunked artificial languages and shows, through comparison with Mandarin Chinese, that chunking may significantly reduce the mean dependency distance of linear sequences. These results suggest that language may have evolved the mechanism of dynamic chunking to reduce complexity for the sake of efficient communication. © 2016 Wiley Periodicals, Inc. Complexity 21: 33–41, 2016
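
As a rough illustration of the quantity the abstract refers to, the sketch below computes a mean dependency distance for one sentence. It assumes the common convention that the distance of a dependency is the absolute difference between the linear positions of a word and its head, averaged over all non-root words. The function name, the head-list encoding (1-based head positions, 0 for the root), and the example sentences are illustrative assumptions, not the article's actual simulation code.

```python
def mean_dependency_distance(heads):
    """Mean absolute distance between each word and its head.

    Assumed encoding (hypothetical, for illustration only):
    heads[i] is the 1-based position of the head of word i + 1,
    and 0 marks the root, which has no head and is skipped.
    """
    distances = [
        abs(head - position)
        for position, head in enumerate(heads, start=1)
        if head != 0
    ]
    if not distances:  # a one-word sentence has no dependencies
        return 0.0
    return sum(distances) / len(distances)


# "The boy ate apples": The -> boy, boy -> ate, ate is the root, apples -> ate.
adjacent_only = [2, 3, 0, 3]

# Same length, but the first word depends on the last one (distance 3).
with_long_link = [4, 1, 2, 0]

print(mean_dependency_distance(adjacent_only))
print(mean_dependency_distance(with_long_link))
```

In the first sentence every dependency links adjacent words, so the mean is 1; in the second, a single long link raises the mean to 5/3. Under this convention, any grouping of words that keeps heads and dependents close together lowers the value.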
