The Cost of Science Performance Assessments in Large-Scale Testing Programs

Estimates of the cost of including hands-on measures of science skills in large-scale assessment programs are drawn from a field trial involving more than 2,000 fifth- and sixth-grade students. The estimates cover the resources needed to develop, administer, and score the tasks. They suggest that performance measures are far more expensive than typical multiple-choice tests for an equal amount of testing time, and more expensive still when the comparison is an equally reliable score for an individual student. Because of the complexity of the equipment and materials involved, hands-on measures in science cost roughly three times as much as open-ended writing assessments. Alternative approaches to development and administration, such as using less expensive equipment or having classroom teachers rather than trained Exercise Administrators deliver the tasks, could reduce costs by up to 50%, but these practices may lower the quality of the data obtained. At the same time, including performance assessments in a state’s testing program may have many positive effects, including fostering standards-based educational reform and encouraging more effective teaching methods. The challenge is to determine whether these potential benefits actually materialize and, if they do, how they can be realized within the budget constraints of most testing programs.
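For a feel of the relative magnitudes involved, the following back-of-the-envelope sketch in Python turns the stated relationships into illustrative per-student figures. The dollar baseline for a multiple-choice test and the writing-versus-multiple-choice multiplier are hypothetical assumptions chosen purely for illustration; only the roughly threefold science-to-writing ratio and the up-to-50% savings from alternative administration come from the abstract above.

    # Illustrative cost arithmetic for the relationships described above.
    # The dollar baseline and the writing multiplier are hypothetical;
    # only the ~3x ratio and the up-to-50% savings come from the abstract.

    MC_COST = 1.00               # assumed per-student cost, multiple-choice test
    WRITING_MULTIPLIER = 5.0     # assumed: open-ended writing vs. multiple-choice
    SCIENCE_VS_WRITING = 3.0     # from the abstract: hands-on science ~3x writing
    MAX_SAVINGS = 0.50           # from the abstract: cheaper equipment and teacher
                                 # administration can cut costs by up to 50%

    writing_cost = MC_COST * WRITING_MULTIPLIER
    science_cost = writing_cost * SCIENCE_VS_WRITING
    science_cost_low = science_cost * (1 - MAX_SAVINGS)

    print(f"Multiple-choice:              ${MC_COST:.2f} per student")
    print(f"Open-ended writing:           ${writing_cost:.2f} per student")
    print(f"Hands-on science:             ${science_cost:.2f} per student")
    print(f"Science, reduced-cost option: ${science_cost_low:.2f} per student")

Whatever baseline is assumed, the qualitative conclusion is the same: the savings from alternative administration narrow, but do not close, the gap between hands-on science tasks and cheaper formats.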
