The scarcity of crossing dependencies: a direct outcome of a specific constraint?

The structure of a sentence can be represented as a network in which vertices are words and edges indicate syntactic dependencies. Crossing syntactic dependencies have been observed to be infrequent in human languages. This raises the question of whether the scarcity of crossings arises from an independent and specific constraint on crossings. We provide statistical evidence suggesting that it does not: the proportion of dependency crossings in sentences from a wide range of languages can be accurately estimated by a simple predictor based on a null hypothesis on the local probability that two dependencies cross given their lengths. The relative error of this predictor never exceeds 5% on average, whereas the error of a baseline predictor that assumes a random ordering of the words of a sentence is at least six times greater. Our results suggest that the low frequency of crossings in natural languages originates neither from hidden knowledge of language nor from the undesirability of crossings per se, but is a mere side effect of the principle of dependency length minimization.
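
To make the notion of a crossing concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a sentence is given as a list of edges between 1-based word positions; the function names and the toy sentence are hypothetical. Two dependencies cross when exactly one endpoint of one lies strictly between the endpoints of the other. The last function illustrates a random-ordering baseline under the assumption that word order is uniformly random, in which case each pair of edges sharing no word crosses with probability 1/3; the length-based predictor described in the abstract is not reproduced here.

```python
from itertools import combinations


def crosses(e1, e2):
    """Return True if two dependencies cross in the linear order of words."""
    a, b = sorted(e1)
    c, d = sorted(e2)
    # Edges sharing a word can never satisfy these strict inequalities.
    return a < c < b < d or c < a < d < b


def count_crossings(edges):
    """Count the pairs of dependencies that cross."""
    return sum(crosses(e1, e2) for e1, e2 in combinations(edges, 2))


def expected_crossings_random_order(edges):
    """Expected crossings if words were ordered uniformly at random.

    Assumption: only pairs of edges with no shared word can cross, and
    each such pair crosses with probability 1/3 under a random ordering.
    """
    independent_pairs = sum(
        1 for e1, e2 in combinations(edges, 2) if not set(e1) & set(e2)
    )
    return independent_pairs / 3


# Hypothetical 4-word sentence with dependencies 1-3, 2-4 and 3-4.
example_edges = [(1, 3), (2, 4), (3, 4)]
print(count_crossings(example_edges))  # prints 1
```

In this toy tree only the pair (1, 3) and (2, 4) shares no word, so it is the only pair that could cross under any ordering, and it does cross in the given one.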
