A streamlined approach to online linguistic surveys

A growing number of researchers in linguistics use large-scale experiments to test hypotheses about the data they study, in addition to more traditional informant work. In this paper we describe turktools, a new set of free, open-source tools that allow linguists to post studies online. These tools support the creation of a wide range of linguistic tasks, including grammaticality surveys, sentence completion tasks, and picture-matching tasks, making large-scale linguistic studies easy to implement. The tools also help streamline the design of such experiments and assist in the extraction and analysis of the resulting data. Surveys created using the tools described in this paper can be posted on Amazon’s Mechanical Turk service, a popular crowdsourcing platform that mediates between ‘Requesters’, who post surveys online, and ‘Workers’, who complete them. This allows many linguistic surveys to be completed within hours or days and at relatively low cost. Alternatively, researchers can host these randomized experiments on their own servers using a supplied server-side component.
