Non-crossing dependencies: least effort, not grammar

The use of null hypotheses (in a statistical sense) is common in the hard sciences but not in theoretical linguistics. Here the null hypothesis that the low frequency of syntactic dependency crossings is expected from an arbitrary ordering of words is rejected. It is shown that this would require star dependency structures, which are both unrealistic and too restrictive. The hypothesis of the limited resources of the human brain is revisited. Stronger null hypotheses that take actual dependency lengths into account when estimating the likelihood of crossings are presented. Those hypotheses suggest that crossings are likely to decrease when dependencies are shortened. A hypothesis based on pressure to reduce dependency lengths is more parsimonious than a principle of minimization of crossings or a grammatical ban that is totally dissociated from the general and non-linguistic principle of economy.
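The quantities named above can be made concrete with a minimal sketch. It is not the paper's own procedure. It assumes a sentence is a tree whose vertices are placed at positions 0..n-1 on a line, that two dependencies cross when their endpoints interleave, and that an arbitrary ordering means a uniformly random permutation of the words. The function names (`count_crossings`, `total_length`, `expected_crossings`) are hypothetical, chosen only for illustration. Two edges that share a vertex can never cross, so a star tree has zero crossings under every ordering, whereas a path tree does not.

```python
from itertools import combinations, permutations


def count_crossings(edges, position):
    """Count pairs of edges whose endpoints interleave on the line."""
    crossings = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if len({a, b, c, d}) < 4:
            continue  # edges sharing a vertex cannot cross
        l1, r1 = sorted((position[a], position[b]))
        l2, r2 = sorted((position[c], position[d]))
        if l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1:
            crossings += 1
    return crossings


def total_length(edges, position):
    """Sum of dependency lengths (distance in words between linked words)."""
    return sum(abs(position[a] - position[b]) for a, b in edges)


def expected_crossings(edges, n):
    """Average number of crossings over all n! orderings (brute force, small n only)."""
    total = 0
    count = 0
    for perm in permutations(range(n)):
        position = {vertex: place for place, vertex in enumerate(perm)}
        total += count_crossings(edges, position)
        count += 1
    return total / count


# Illustrative 5-word trees (assumed examples, not data from the paper).
star = [(0, 1), (0, 2), (0, 3), (0, 4)]   # one hub linked to every other word
path = [(0, 1), (1, 2), (2, 3), (3, 4)]   # a chain of words

print(expected_crossings(star, 5))  # 0.0: no ordering of a star produces crossings
print(expected_crossings(path, 5))  # 1.0: three vertex-disjoint edge pairs, each crossing with probability 1/3
```

The brute-force average is only practical for very short sentences; it is meant to show why a random-ordering null hypothesis predicts few crossings only for star-like structures.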
