Assessing the reliability of textbook data in syntax: Adger's Core Syntax

There has been a consistent pattern of criticism of the reliability of acceptability judgment data in syntax for at least 50 years (e.g., Hill 1961), culminating in several high-profile criticisms within the past ten years (Edelman & Christiansen 2003, Ferreira 2005, Wasow & Arnold 2005, Gibson & Fedorenko 2010, in press). The fundamental claim of these critics is that traditional acceptability judgment collection methods, which tend to be relatively informal compared to methods from experimental psychology, lead to an intolerably high number of false positive results. In this paper we empirically assess this claim by formally testing all 469 (unique, US-English) data points from a popular syntax textbook (Adger 2003) using 440 naïve participants, two judgment tasks (magnitude estimation and yes–no), and three different types of statistical analyses (standard frequentist tests, linear mixed effects models, and Bayes factor analyses). The results suggest that the maximum discrepancy between traditional methods and formal experimental methods is 2%. This suggests that even under the (likely unwarranted) assumption that the discrepant results are all false positives that have found their way into the syntactic literature due to the shortcomings of traditional methods, the minimum replication rate of these 469 data points is 98%. We discuss the implications of these results for questions about the reliability of syntactic data, as well as the practical consequences of these results for the methodological options available to syntacticians.

[1]  L. M. M.-T. Theory of Probability , 1929, Nature.

[2]  S. S. Stevens On the psychophysical law. , 1957, Psychological review.

[3]  Jacob Cohen,et al.  The statistical power of abnormal-social psychological research: a review. , 1962, Journal of abnormal and social psychology.

[4]  Noam Chomsky,et al.  Aspects of the theory of syntax , 1965 .

[5]  Jacob Cohen Statistical Power Analysis for the Behavioral Sciences , 1969, The SAGE Encyclopedia of Research Design.

[6]  N. J. Spencer,et al.  Differences between linguists and nonlinguists in intuitions of grammaticality-acceptability , 1973, Journal of psycholinguistic research.

[7]  Noam Chomsky,et al.  The Logical Structure of Linguistic Theory , 1975 .

[8]  P. Lachenbruch Statistical Power Analysis for the Behavioral Sciences (2nd ed.) , 1989 .

[9]  Edward Gibson,et al.  A computational theory of human linguistic processing: memory limitations and processing breakdown , 1991 .

[10]  A. Sorace,et al.  MAGNITUDE ESTIMATION OF LINGUISTIC ACCEPTABILITY , 1996 .

[11]  Wayne Cowart,et al.  Experimental Syntax: Applying Objective Methods to Sentence Judgments , 1997 .

[12]  Carson T. Schütze The empirical base of linguistics: Grammaticality judgments and linguistic methodology , 1998 .

[13]  T. Vance,et al.  Japanese/Korean linguistics , 1999 .

[14]  R. Nickerson,et al.  Null hypothesis significance testing: a review of an old and continuing controversy. , 2000, Psychological methods.

[15]  Frank Keller,et al.  Gradience in Grammar: Experimental and Computational Aspects of Degrees of Grammaticality , 2001 .

[16]  Colin Phillips,et al.  Linguistics and empirical evidence Reply to Edelman and Christiansen , 2003, Trends in Cognitive Sciences.

[17]  Morten H. Christiansen,et al.  How seriously should we take Minimalist syntax? , 2003, Trends in Cognitive Sciences.

[18]  Frank Keller,et al.  A Psychophysical Law for Linguistic Judgments , 2003 .

[19]  David Adger,et al.  Core Syntax: A Minimalist Approach , 2003 .

[20]  Sam Featherston,et al.  Magnitude estimation and what it can do for your syntax: some wh-constraints in German , 2005 .

[21]  F. Ferreira Psycholinguistics, formal grammars, and cognitive science , 2005 .

[22]  Sam Featherston,et al.  Universals and grammaticality: wh-constraints in German and English , 2005 .

[23]  Antonella Sorace,et al.  Gradience in Linguistic Data , 2005 .

[24]  Thomas Wasow,et al.  Intuitions in linguistic argumentation , 2005 .

[25]  Frank Keller,et al.  Locality, Cyclicity, and Resumption: At the Interface between the Grammar and the Human Sentence Processor , 2007 .

[26]  Sam Featherston,et al.  Data in generative grammar: The stick and the carrot , 2007 .

[27]  Günther Grewendorf Empirical evidence and theoretical reasoning in generative grammar , 2007 .

[28]  Bridget Samuels,et al.  On Evolutionary Phonology , 2007, Biolinguistics.

[29]  Frederick J. Newmeyer Commentary on Sam Featherston, ‘Data in generative grammar: The stick and the carrot’ , 2007 .

[30]  Jon Sprouse Continuous Acceptability, Categorical Grammaticality, and Experimental Syntax , 2007, Biolinguistics.

[31]  Hubert Haider,et al.  As a matter of facts – comments on Featherston's sticks and carrots , 2007 .

[32]  Gisbert Fanselow,et al.  Carrots – perfect as vegetables, but please not as a main dish , 2007 .

[33]  Judy B. Bernstein,et al.  Data and grammar: Means and individuals , 2007 .

[34]  M. Heft,et al.  Connectedness , 2007, Journal of dental research.

[35]  R. Baayen,et al.  Mixed-effects modeling with crossed random effects for subjects and items , 2008 .

[36]  Jon Sprouse The Differential Sensitivity of Acceptability Judgments to Processing Effects , 2008, Linguistic Inquiry.

[37]  Patrick Dattalo,et al.  Statistical Power Analysis , 2008 .

[38]  James Myers,et al.  Syntactic Judgment Experiments , 2009, Lang. Linguistics Compass.

[39]  C. Gallistel,et al.  The Importance of Proving the Null , 2009 .

[40]  Sam Featherston Relax, lean back, and be a linguist , 2009 .

[41]  J. Raaijmakers,et al.  How to quantify support for and against the null hypothesis: A flexible WinBUGS implementation of a default Bayesian t test , 2009, Psychonomic bulletin & review.

[42]  Jeffrey N. Rouder,et al.  Bayesian t tests for accepting and rejecting the null hypothesis , 2009, Psychonomic bulletin & review.

[43]  Jennifer Culbertson,et al.  Are Linguists Better Subjects? , 2009, The British Journal for the Philosophy of Science.

[44]  Jon Sprouse,et al.  Revisiting Satiation: Evidence for an Equalization Response Strategy , 2009, Linguistic Inquiry.

[45]  M. Bader,et al.  Toward a model of grammaticality judgments , 2009, Journal of Linguistics.

[46]  Elisabeth Dévière,et al.  Analyzing linguistic data: a practical introduction to statistics using R , 2009 .

[47]  E. Gibson,et al.  Weak quantitative standards in linguistics research , 2010, Trends in Cognitive Sciences.

[48]  E. Dąbrowska Naive v. expert intuitions: An empirical study of acceptability judgments , 2010 .

[49]  Peter W. Culicover,et al.  Quantitative methods alone are not enough: Response to Gibson and Fedorenko , 2010, Trends in Cognitive Sciences.

[50]  Edward Gibson,et al.  Using Mechanical Turk to Obtain and Analyze English Acceptability Judgments , 2011, Lang. Linguistics Compass.

[51]  Jennifer Culbertson,et al.  Revisited Linguistic Intuitions , 2011, The British Journal for the Philosophy of Science.

[52]  Jon Sprouse,et al.  A Test of the Cognitive Assumptions of Magnitude Estimation: Commutativity does not Hold for Acceptability Judgments , 2011 .

[53]  Jon Sprouse A validation of Amazon Mechanical Turk for the collection of acceptability judgments in linguistic theory , 2010, Behavior research methods.

[54]  Hajime Ono,et al.  Reverse Island Effects and the Backward Search for a Licensor in Multiple Wh-Questions , 2011 .

[55]  G. Fanselow,et al.  On the Informativity of Different Measures of Linguistic Acceptability , 2011 .

[56]  Kleanthes K. Grohmann,et al.  The Cambridge handbook of biolinguistics , 2012 .

[57]  Jon Sprouse,et al.  A test of the relation between working-memory capacity and syntactic island effects , 2012 .

[58]  Diogo Almeida,et al.  The Cambridge Handbook of Biolinguistics: The role of experimental syntax in an integrated cognitive science of language , 2013 .

[59]  Morten H. Christiansen,et al.  The need for quantitative methods in syntax and semantics research , 2013 .

[60]  Inbal Arnon,et al.  The source ambiguity problem: Distinguishing the effects of grammar and processing on acceptability judgments , 2011 .

[61]  Robert J. Podesva,et al.  Research methods in linguistics , 2013 .

[62]  C. Phillips Should we impeach armchair linguists? , 2022 .

[63]  Jon Sprouse,et al.  Assessing the reliability of journal data in syntax: Linguistic Inquiry 2001-2010 , 2022 .