Bootstrapping in Applied Linguistics: Assessing its Potential Using Shared Data

Parametric analyses such as t tests and ANOVAs are the norm, if not the default, among statistical tests in quantitative applied linguistics research (Gass 2009). Applied statisticians and one applied linguist (Larson-Hall 2010, 2012; Larson-Hall and Herrington 2010), however, have argued that this approach may not be appropriate for small samples and/or nonnormally distributed data (e.g. Wilcox 2003), both of which are common in second language (L2) research. They recommend instead ‘robust statistics’ such as bootstrapping, a nonparametric procedure that repeatedly resamples at random, with replacement, from an observed data set to produce a simulated but more stable and statistically accurate outcome. The present study tests the usefulness of bootstrapping by reanalyzing raw data from 26 studies in applied linguistics. We found no evidence of Type II error (false negatives). However, 4 of 16 statistically significant results (25%) were not replicated (i.e. a Type I error ‘misfit’ five times higher than an alpha of .05). We discuss empirically justified suggestions for the use of bootstrapping in the context of broader methodological issues and reforms in applied linguistics (see Plonsky 2013, 2014).
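To make the resampling idea concrete, the following is a minimal Python sketch of a percentile bootstrap confidence interval for a difference between two group means. It is an illustration under stated assumptions, not the procedure used in the study: the function name `bootstrap_mean_diff_ci`, the two score lists, the number of resamples, and the choice of the percentile method (one of several bootstrap interval types) are all hypothetical.

```python
import random
import statistics


def bootstrap_mean_diff_ci(group_a, group_b, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(group_a) - mean(group_b).

    Hypothetical helper for illustration only; the study's own
    bootstrap settings are not specified here.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled differences.
    lower_idx = round((alpha / 2) * n_resamples)
    upper_idx = round((1 - alpha / 2) * n_resamples) - 1
    return diffs[lower_idx], diffs[upper_idx]


# Made-up test scores for two small groups (not data from any study).
treatment = [72, 85, 78, 90, 66, 81, 79, 88]
control = [60, 71, 65, 58, 74, 62, 69, 55]

lower, upper = bootstrap_mean_diff_ci(treatment, control)
print(f"95% bootstrap CI for the mean difference: [{lower:.2f}, {upper:.2f}]")
```

If the resulting interval excludes zero, the bootstrap analysis agrees with a statistically significant parametric result at the same alpha; an interval that includes zero would correspond to the kind of non-replication reported above.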

[1]  A. Gelman,et al.  Of Beauty, Sex and Power , 2009 .

[2]  S. Loewen,et al.  Statistical Literacy Among Applied Linguists and Second Language Acquisition Researchers , 2014 .

[3]  R Core Team,et al.  R: A language and environment for statistical computing. , 2014 .

[4]  James Algina,et al.  A generally robust approach for testing hypotheses and setting confidence intervals for effect sizes. , 2008, Psychological methods.

[5]  R. Wilcox Fundamentals of Modern Statistical Methods: Substantially Improving Power and Accuracy , 2001 .

[6]  James B. Boyer,et al.  An Editorial Statement , 1981, Annals of the History of Computing.

[7]  Frances P Lawrenz,et al.  The Precision of Data Obtained in Large-Scale Science Assessments: An Investigation of Bootstrapping and Half-Sample Replication Methods. , 1998 .

[8]  Laura L. Lansing Bootstrapping versus the student's t : the problems of Type I error and power , 1999 .

[9]  Luke D Plonsky,et al.  Study quality in quantitative L2 research (1990-2010): A methodological synthesis and call for reform , 2014 .

[10]  John B. Willett,et al.  By Design: Planning Research on Higher Education , 1990 .

[11]  F. Yu,et al.  Multivariate nonparametric techniques for astigmatism analysis , 2010, Journal of cataract and refractive surgery.

[12]  S. Gass,et al.  Quantitative Research Methods, Study Quality, and Outcomes: The Case of Interaction Research , 2011 .

[13]  J. Tukey The Philosophy of Multiple Comparisons , 1991 .

[14]  P. Allotey,et al.  Data sharing in medical research: an empirical investigation. , 2001, Bioethics.

[15]  Frederick L. Oswald,et al.  Meta-analysis in Second Language Research: Choices and Challenges , 2010, Annual Review of Applied Linguistics.

[16]  William Howard Beasley,et al.  Bootstrapping to test for nonzero population correlation coefficients using univariate sampling. , 2007, Psychological methods.

[17]  Link,et al.  Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results , 2014 .

[18]  G. Crookes,et al.  Power, Effect Size, and Second Language Research A Researcher Comments… , 1991 .

[19]  Mark Kane,et al.  Chapter 17 – Analysing quantitative data , 2004 .

[20]  Kenneth J. Meier Replication: A View From the Streets , 1995 .

[21]  David Hinkley,et al.  Bootstrap Methods: Another Look at the Jackknife , 2008 .

[22]  Frederick L. Oswald,et al.  How to do a Meta‐Analysis , 2012 .

[23]  Luke Plonsky,et al.  The Effectiveness of Second Language Pronunciation Instruction: A Meta-analysis , 2015 .

[24]  P. Good Resampling Methods , 1999, Birkhäuser Boston.

[25]  Olivier Gascuel,et al.  On the Interpretation of Bootstrap Trees: Appropriate Threshold of Clade Selection and Induced Gain , 1996 .

[26]  Jenifer Larson-Hall,et al.  Our statistical intuitions may be misleading us: Why we need robust statistics , 2011, Language Teaching.

[27]  Richard E. Lucas,et al.  Improving the replicability and reproducibility of research published in the Journal of Research in Personality , 2013 .

[28]  Graeme Keith Porte,et al.  Replication research in applied linguistics , 2012 .

[29]  Sebastián M. Real,et al.  E2F1 Regulates Cellular Growth by mTORC1 Signaling , 2011, PloS one.

[30]  Jenifer Larson-Hall,et al.  A Guide to Doing Statistics in Second Language Research Using SPSS , 2009 .

[31]  Stephen Reder,et al.  The Multimedia Adult ESL Learner Corpus , 2003 .

[32]  D. Borsboom,et al.  The poor availability of psychological research data for reanalysis. , 2006, The American psychologist.

[33]  Andy Field,et al.  Discovering statistics using SPSS, 2nd ed. , 2005 .

[34]  Graham Crookes Another Researcher Comments , 1991 .

[35]  P. Lachenbruch Statistical Power Analysis for the Behavioral Sciences (2nd ed.) , 1989 .

[36]  Robert Tibshirani,et al.  An Introduction to the Bootstrap , 1994 .

[37]  E. Ziegel,et al.  Bootstrapping: A Nonparametric Approach to Statistical Inference , 1993 .

[38]  A. Gelman,et al.  Of Beauty, Sex and Power: Too little attention has been paid to the statistical challenges in estimating small effects , 2022 .

[39]  Anne Lazaraton,et al.  Quantitative Research Methods , 2005 .

[40]  Steven P. Abney,et al.  Bootstrapping , 2002, ACL.

[41]  J. Tukey A survey of sampling from contaminated distributions , 1960 .

[42]  Anthony C. Davison,et al.  Bootstrap Methods and Their Application , 1998 .

[43]  Glenn Firebaugh Replication Data Sets and Favored-Hypothesis Bias , 2007 .

[44]  Jenifer Larson-Hall,et al.  Improving Data Analysis in Second Language Acquisition by Utilizing Modern Developments in Applied Statistics , 2010 .

[45]  Norman Kaplan,et al.  The Sociology of Science: Theoretical and Empirical Investigations , 1974 .

[46]  Luke Plonsky,et al.  The Effectiveness of Second Language Strategy Instruction: A Meta-analysis , 2011 .

[47]  Francesco Nocera,et al.  Resampling approach to statistical inference: Bootstrapping from event-related potentials data , 2000, Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc.

[48]  Michael R Chernick,et al.  Bootstrap Methods: A Guide for Practitioners and Researchers , 2007 .

[49]  Luke Plonsky,et al.  How Big Is “Big”? Interpreting Effect Sizes in L2 Research , 2014 .

[50]  Luke Plonsky,et al.  STUDY QUALITY IN SLA , 2013, Studies in Second Language Acquisition.

[51]  J. Ioannidis Why Most Discovered True Associations Are Inflated , 2008, Epidemiology.

[52]  E. Wolfe,et al.  Comparison of Asymptotic and Bootstrap Item Fit Indices in Identifying Misfit to the Rasch Model , 2011, National Conference on Measurement in Education.

[53]  R. Wilcox Applying Contemporary Statistical Techniques , 2003 .

[54]  Sally Sieloff Magnan From the editor: The MLJ tradition and the challenges ahead , 1994 .

[55]  Won-Chan Lee,et al.  Bootstrapping correlation coefficients using univariate and bivariate sampling. , 1998 .

[56]  J. Norris,et al.  Effectiveness of L2 Instruction: A Research Synthesis and Quantitative Meta‐analysis , 2000 .

[57]  Lyle F. Bachman Statistical analyses for language assessment , 2004 .