Measuring Gradience in Speakers' Grammaticality Judgements

The question of whether grammaticality is a binary categorical property or a gradient one has long been debated in linguistics and psychology. Linguists have tended to use constructed examples to test speakers’ judgements on specific sorts of constraint violation. We applied machine translation to randomly selected subsets of the British National Corpus (BNC) to generate a large test set that contains both well-formed English source sentences and sentences that exhibit a wide variety of grammatical infelicities. We tested a large number of speakers through (filtered) crowdsourcing, using three distinct modes of classification: one binary and two on ordered scales. We found a high degree of correlation in mean judgements for sentences across the three classification tasks. We also ran two visual image classification tasks to obtain benchmarks for binary and gradient judgement patterns, respectively. Finally, we ran a second crowdsourcing experiment on 100 randomly selected example sentences from linguistics textbooks. The sentence judgement distributions for individual speakers strongly resemble the gradience benchmark pattern. This evidence suggests that speakers represent grammatical well-formedness as a gradient property.
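The comparison of mean judgements across classification tasks can be illustrated with a minimal sketch. It assumes that judgements are stored as (sentence id, rating) pairs and that Pearson's r is the correlation measure; the abstract specifies neither. The function names, the toy ratings, and the 4-point scale are hypothetical and are not the study's data or code.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def mean_by_sentence(judgements):
    """Average the ratings each sentence received.

    `judgements` is an iterable of (sentence_id, rating) pairs,
    one pair per speaker response (an assumed storage format).
    """
    ratings = defaultdict(list)
    for sentence_id, rating in judgements:
        ratings[sentence_id].append(rating)
    return {sid: mean(vals) for sid, vals in ratings.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def task_correlation(task_a, task_b):
    """Correlate per-sentence mean ratings between two tasks."""
    means_a = mean_by_sentence(task_a)
    means_b = mean_by_sentence(task_b)
    # Only sentences judged in both tasks can be compared.
    shared = sorted(set(means_a) & set(means_b))
    return pearson([means_a[s] for s in shared],
                   [means_b[s] for s in shared])


# Toy ratings, invented for illustration only.
binary_task = [("s1", 1), ("s1", 1), ("s2", 0), ("s2", 1), ("s3", 0), ("s3", 0)]
# Hypothetical 4-point ordered scale.
scale_task = [("s1", 4), ("s1", 3), ("s2", 2), ("s2", 3), ("s3", 1), ("s3", 1)]

r = task_correlation(binary_task, scale_task)
print(f"correlation of mean judgements: {r:.3f}")
```

A value of r close to 1 would indicate that sentences rated highly on average in one task are also rated highly in the other, which is the kind of agreement the abstract reports across its three tasks.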
