Corrigendum: Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results

Twenty-nine teams involving 61 analysts used the same data set to address the same research question: whether soccer referees are more likely to give red cards to dark-skin-toned players than to light-skin-toned players. Analytic approaches varied widely across the teams, and the estimated effect sizes ranged from 0.89 to 2.93 (Mdn = 1.31) in odds-ratio units. Twenty teams (69%) found a statistically significant positive effect, and nine teams (31%) did not observe a significant relationship. Overall, the 29 different analyses used 21 unique combinations of covariates. Neither analysts' prior beliefs about the effect of interest nor their level of expertise readily explained the variation in the outcomes of the analyses. Peer ratings of the quality of the analyses also did not account for the variability. These findings suggest that significant variation in the results of analyses of complex data may be difficult to avoid, even by experts with honest intentions. Crowdsourcing data analysis, a strategy in which numerous research teams are recruited to simultaneously investigate the same research question, makes transparent how defensible, yet subjective, analytic choices influence research results.
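The effect sizes above are reported in odds-ratio units: the odds of a dark-skin-toned player receiving a red card divided by the odds for a light-skin-toned player, so that values above 1 indicate a positive effect. The sketch below shows how such a ratio is computed from a 2×2 contingency table; the counts are purely hypothetical and are not taken from the study's data.

```python
# Hypothetical counts for illustration only -- not the actual study data.
# Odds ratio for receiving a red card, dark- vs. light-skin-toned players.

def odds_ratio(dark_red, dark_no_red, light_red, light_no_red):
    """Odds ratio from a 2x2 contingency table of red-card outcomes."""
    odds_dark = dark_red / dark_no_red      # odds for dark-skin-toned players
    odds_light = light_red / light_no_red   # odds for light-skin-toned players
    return odds_dark / odds_light

# Example: 60 red cards out of 1,000 dark-skin-toned player observations,
# versus 45 out of 1,000 light-skin-toned player observations.
estimate = odds_ratio(60, 940, 45, 955)
print(round(estimate, 2))  # -> 1.35
```

An odds ratio of 1.35 would fall near the median estimate reported across the 29 teams (Mdn = 1.31); the wide 0.89–2.93 range reflects how different covariate sets and model choices shift this one number.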
