TEST THEORY RECONCEIVED

Educational test theory consists of statistical and methodological tools to support inference about examinees' knowledge, skills, and accomplishments. The evolution of test theory has been shaped by the nature of users' inferences, which, until recently, have been framed almost exclusively in terms of trait and behavioral psychology. Progress in the methodology of test theory enabled users to extend the range of inference, sharpen the logic, and ground their interpretations more solidly within these psychological paradigms. In particular, the focus remained on students' overall tendency to perform in prespecified ways in prespecified domains of tasks; for example, to give correct answers to mixed-number subtraction problems. Developments in cognitive and developmental psychology broaden the range of desired inferences, especially to conjectures about the nature and acquisition of students' knowledge. Commensurately broader ranges of data types and student models are entertained. The same underlying principles of inference that led to standard test theory can be applied to support inference in this broader universe of discourse. Familiar models and methods (sometimes extended, sometimes reinterpreted, sometimes applied to problems wholly different from those for which they were first devised) can play a useful role to this end.
