Using Online, Crowdsourcing Platforms for Data Collection in Personality Disorder Research: The Example of Amazon’s Mechanical Turk

The use of crowdsourcing platforms such as Amazon’s Mechanical Turk (MTurk) for data collection in the behavioral sciences has increased substantially in the past several years, due in large part to (a) the ability to recruit large samples, (b) the low cost of data collection, (c) the speed of data collection, and (d) evidence that the data collected are, for the most part, of equal or better quality than those collected in undergraduate research pools. In this review, we first evaluate the strengths and potential limitations of this approach to data collection. Second, we examine how MTurk has been used to date in personality disorder (PD) research and compare the characteristics of such research to PD research conducted in other settings. Third, we compare PD trait data from the Section III trait model of the DSM–5 collected via MTurk to data collected using undergraduate and clinical samples with regard to internal consistency, mean-level differences, and factor structure. Overall, we conclude that platforms such as MTurk have much to offer PD researchers, especially for certain kinds of research (e.g., where large samples are required and there is a need for iterative sampling). Whether MTurk itself remains the predominant model of such platforms is unclear, however, and will largely depend on decisions related to cost effectiveness and the development of alternatives that offer even greater flexibility.
