Comparability of Computer-Based and Paper-and-Pencil Testing in K–12 Reading Assessments

In recent years, computer-based testing (CBT) has grown in popularity, is increasingly being implemented across the United States, and is likely to become the primary mode of test delivery. Although CBT offers many advantages over traditional paper-and-pencil testing, assessment experts, researchers, practitioners, and test users have expressed concern about the comparability of scores across the two administration modes. To address this concern, a meta-analysis was conducted to synthesize the administration mode effects of computer-based and paper-and-pencil tests on K–12 student reading assessments. Findings indicate that administration mode had no statistically significant effect on K–12 student reading achievement scores. Four moderator variables (study design, sample size, computer delivery algorithm, and computer practice) made statistically significant contributions to predicting effect size. Three moderator variables (grade level, type of test, and computer delivery method) did not affect the differences in reading scores between test modes.
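The synthesis described above rests on standardized mean-difference effect sizes pooled across studies and then examined against moderator variables. The sketch below is a minimal, illustrative Python example, not the authors' code: it shows how a Hedges' g effect size and a DerSimonian-Laird random-effects pooled estimate might be computed from per-study summaries. The study values in it are hypothetical placeholders, and the paper's actual estimation procedures may differ.

```python
# Minimal sketch (assumptions, not the study's code): Hedges' g per study,
# then a DerSimonian-Laird random-effects pooled estimate of the CBT vs.
# paper-and-pencil mode effect. Per-study numbers below are hypothetical.
import numpy as np

def hedges_g(mean_cbt, mean_ppt, sd_cbt, sd_ppt, n_cbt, n_ppt):
    """Bias-corrected standardized mean difference (Hedges' g) and its variance."""
    sd_pooled = np.sqrt(((n_cbt - 1) * sd_cbt**2 + (n_ppt - 1) * sd_ppt**2)
                        / (n_cbt + n_ppt - 2))
    d = (mean_cbt - mean_ppt) / sd_pooled
    j = 1 - 3 / (4 * (n_cbt + n_ppt) - 9)            # small-sample correction
    g = j * d
    var_g = (n_cbt + n_ppt) / (n_cbt * n_ppt) + g**2 / (2 * (n_cbt + n_ppt))
    return g, var_g

def random_effects_pool(g, var_g):
    """DerSimonian-Laird pooled effect, its standard error, and tau^2."""
    g, var_g = np.asarray(g, float), np.asarray(var_g, float)
    w = 1 / var_g                                     # fixed-effect weights
    fixed = np.sum(w * g) / np.sum(w)
    q = np.sum(w * (g - fixed)**2)                    # heterogeneity statistic
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(g) - 1)) / c)           # between-study variance
    w_star = 1 / (var_g + tau2)                       # random-effects weights
    pooled = np.sum(w_star * g) / np.sum(w_star)
    se = np.sqrt(1 / np.sum(w_star))
    return pooled, se, tau2

# Hypothetical study summaries: (mean_cbt, mean_ppt, sd_cbt, sd_ppt, n_cbt, n_ppt)
studies = [(52.1, 51.4, 10.2, 9.8, 120, 118),
           (48.7, 49.5, 11.0, 10.6, 85, 90),
           (55.3, 55.0, 9.4, 9.9, 200, 195)]
effects = [hedges_g(*s) for s in studies]
pooled, se, tau2 = random_effects_pool([e[0] for e in effects],
                                        [e[1] for e in effects])
print(f"pooled g = {pooled:.3f}, "
      f"95% CI = [{pooled - 1.96 * se:.3f}, {pooled + 1.96 * se:.3f}]")
```

In a moderator analysis such as the one reported here, the per-study g values would additionally be regressed on coded study features (for example, study design or sample size) with weights based on their variances; the pooled estimate and confidence interval printed above indicate whether the overall mode effect differs from zero.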
