Extending Classical Test Theory to the Measurement of Change

The concept of a change score has considerable intuitive appeal. A person subtracts last week's weight from today's weight and talks of having gained or lost five pounds. Yet change scores have more than their share of conceptual problems. Weights are comparable: a two-hundred-pounder outweighs a one-hundred-pounder regardless of his other traits. Changes are not necessarily comparable: a loss of twenty-five pounds may be a godsend for one individual but a disaster for another. Even in cases where changes in one direction are preferred, certain comparisons of changes appear inappropriate. For example, an instructor may grade physical education students on their improvement in running the mile. All of the students running an eight-minute mile at the beginning of the course may cut more than a minute from their times; none of the four-minute milers is likely to improve by more than a few seconds. Clearly, the eight-minute milers "improved" their times by more seconds than did the four-minute milers. Yet no instructor would give A's to the slowest runners and F's to the fastest, regardless of his commitment to the concept of grading on improvement. Somehow these "improvements" are not comparable for the purposes of evaluation. This inability to compare changes directly at different points of the scale, even with ratio scales, is the fundamental problem of the measurement of change. The comparability problem is related to the fact that change scores are generally correlated with initial status. When change and initial status are negatively correlated, low scorers have an advantage in the sense that they are likely to gain more. Similarly, in the rarer instances when change and initial status are positively correlated, the initially high-scoring individuals have the advantage.
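The correlation between change and initial status can be illustrated with a small simulation. The sketch below is not taken from the paper; it assumes the simplest classical test theory setup, in which each observed score is a true score plus independent random error, and it further assumes that no one's true score changes between testings. The function names, score scale, and standard deviations are all hypothetical choices made for illustration. Under these assumptions the covariance of initial score with gain equals minus the error variance, so low initial scorers appear to gain more even though nothing has actually changed.

```python
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_gain_correlation(n=20000, true_sd=10.0, error_sd=5.0, seed=0):
    """Correlate initial observed score with observed gain.

    Assumptions (illustrative only): true scores are normal with mean 50,
    errors are independent normal noise, and true scores do not change
    between the two testings.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n)]
    # Observed score = true score + independent error on each occasion.
    pre = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    post = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    gain = [after - before for before, after in zip(pre, post)]
    return pearson(pre, gain)


r = simulate_gain_correlation()
print(f"correlation of initial score with gain: {r:.3f}")
```

With these parameter values the expected correlation is -25 / sqrt(125 * 50), roughly -0.32; the simulated value should fall near that figure but will vary with the seed and sample size. Real data would also include true change, which may push the correlation in either direction.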
