Measuring Linguistic Complexity: Introducing a New Categorial Metric

This paper proposes a computable quantitative measure of the difficulty humans experience in processing sentences: why is one sentence harder to parse than another, and why is one reading of a sentence easier than another? We take as given psycholinguistic results on human processing complexity, such as those of Gibson. We define a new metric that uses categorial proof nets to correctly model Gibson’s account in his Dependency Locality Theory. The proposed metric correctly predicts several performance phenomena, including structures with embedded pronouns, garden paths, unacceptable center embeddings, the preference for lower attachment, and the acceptability of passive paraphrases. Our proposal moves closer to modern computational psycholinguistic theories, and it opens the door to including semantic complexity, thanks to the straightforward syntax-semantics interface of categorial grammars.
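To make the distance-based intuition behind Dependency Locality Theory concrete, here is a minimal Python sketch. It is not the proof-net metric defined in the paper. It assumes a simplified cost in which each dependency is charged the number of new discourse referents lying strictly between its two endpoints. The function names, the example sentence, the referent annotation, and the dependency list are all hypothetical and hand-written for illustration.

```python
from typing import List, Tuple


def intervening_referents(new_referent: List[bool], i: int, j: int) -> int:
    """Count new discourse referents strictly between word positions i and j.

    Simplifying assumption: this stands in for a DLT-style locality cost;
    it is not the categorial proof-net metric of the paper.
    """
    lo, hi = sorted((i, j))
    return sum(new_referent[lo + 1:hi])


def total_cost(new_referent: List[bool], dependencies: List[Tuple[int, int]]) -> int:
    """Sum the simplified locality cost over all (head, dependent) pairs."""
    return sum(intervening_referents(new_referent, h, d) for h, d in dependencies)


# Hypothetical hand annotation of an object-relative clause.
WORDS = "The reporter who the senator attacked admitted the error".split()
# Assumption: nouns and tensed verbs introduce new discourse referents.
NEW_REFERENT = [w in {"reporter", "senator", "attacked", "admitted", "error"}
                for w in WORDS]
# (head index, dependent index) pairs, chosen by hand for this example.
DEPENDENCIES = [(6, 1), (5, 2), (5, 4), (6, 8)]

if __name__ == "__main__":
    for head, dep in DEPENDENCIES:
        cost = intervening_referents(NEW_REFERENT, head, dep)
        print(f"{WORDS[dep]} -> {WORDS[head]}: {cost}")
    print("total:", total_cost(NEW_REFERENT, DEPENDENCIES))
```

Under these assumptions, the long dependency between "reporter" and "admitted" is the most costly one, because the embedded clause introduces two referents ("senator", "attacked") before the dependency can be resolved.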
