Reliability and validity of a mathematics performance assessment

The QUASAR Cognitive Assessment Instrument (QCAI) is designed to measure program outcomes and growth in mathematics. It consists of a relatively large set of open-ended tasks that assess mathematical problem solving, reasoning, and communication at the middle-school grade levels. This study provides some evidence for the generalizability and validity of the assessment. The generalizability studies indicate that error due to raters is minimal, whereas students' relative performance varies considerably from task to task. The dependability of grade-level scores for absolute decision making is encouraging: with 350 students, the coefficients range from .80 to .97, depending on the form and grade level. As expected, QCAI scores tended to be more strongly related to the problem-solving and conceptual subtest scores of a multiple-choice mathematics achievement test than to its mathematics computation subtest scores. Mathematics reformers (e.g., National Council of Teachers of Mathematics, 1989) suggest that the intent of mathematics instruction should be to promote

[1] R. Brennan. Elements of generalizability theory, 1983.

[2] Stephen B. Dunbar, et al. Quality Control in the Development and Use of Performance Assessments, 1991.

[3] Suzanne Lane, et al. Use of Generalizability Theory for Estimating the Dependability of a Scoring System for Sample Essays, 1989.

[4] K. Jöreskog. A general approach to confirmatory maximum likelihood factor analysis, 1969.

[5] Robert L. Linn, et al. Educational Assessment: Expanded Expectations and Challenges, 1993.

[6] Karl G. Jöreskog, et al. LISREL 7: A guide to the program and applications, 1988.

[7] L. Crocker, et al. Validation Methods for Direct Writing Assessment, 1990.

[8] Jay Magidson, et al. Advances in factor analysis and structural equation models, 1979.

[9] Samuel Messick, et al. The Interplay of Evidence and Consequences in the Validation of Performance Assessments. Research Report, 1992.

[10] R. Shavelson, et al. Research News and Comment: Performance Assessments, 1992.

[11] J. Frederiksen, et al. A Systems Approach to Educational Testing, 1989.

[12] R. Shavelson. Performance Assessments: Political Rhetoric and Measurement Reality, 1992.

[13] C. Hirsch. Curriculum and Evaluation Standards for School Mathematics, 1988.

[14] R. Shavelson. Performance Assessment in Science, 1991.

[15] Stephen B. Dunbar, et al. Complex, Performance-Based Assessment: Expectations and Validation Criteria, 1991.

[16] M. Browne. Asymptotically distribution-free methods for the analysis of covariance structures, 1984, The British Journal of Mathematical and Statistical Psychology.