Evaluating the Equal-Interval Hypothesis with Test Score Scales

The axioms of additive conjoint measurement provide a means of testing the hypothesis that test data can be placed on a scale with equal-interval properties. However, the axioms are difficult to verify because item responses may be subject to measurement error. A Bayesian method exists for imposing the order restrictions of additive conjoint measurement while estimating the probability of a correct response. In this study, an improved version of that methodology is evaluated via simulation. The approach is then applied to data from a reading assessment intentionally designed to support equal-interval scaling.
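
As an illustration of what the order restrictions involve, the following minimal sketch checks two of the axioms, single cancellation (here reduced to monotonicity) and double cancellation, on a small matrix of response probabilities. It is not the Bayesian method evaluated in the study. It assumes error-free probabilities, with rows (person groups) and columns (items) already sorted from lowest to highest. The function names and both example matrices are hypothetical.

```python
from itertools import combinations
from math import exp


def is_monotone(P):
    """Single cancellation, assuming rows and columns are pre-sorted:
    probabilities must not decrease along any row or down any column."""
    rows_ok = all(row[c] <= row[c + 1] for row in P for c in range(len(row) - 1))
    cols_ok = all(
        P[r][c] <= P[r + 1][c]
        for r in range(len(P) - 1)
        for c in range(len(P[0]))
    )
    return rows_ok and cols_ok


def double_cancellation_violations(P):
    """Return every (row triple, column triple) that breaks double cancellation:
    P[j][p] >= P[i][q] and P[k][q] >= P[j][r] must imply P[k][p] >= P[i][r]."""
    violations = []
    for i, j, k in combinations(range(len(P)), 3):
        for p, q, r in combinations(range(len(P[0])), 3):
            antecedent = P[j][p] >= P[i][q] and P[k][q] >= P[j][r]
            if antecedent and not P[k][p] >= P[i][r]:
                violations.append(((i, j, k), (p, q, r)))
    return violations


# Hypothetical probabilities that are additive on the logit scale
# (ability plus item easiness), so both checks should pass.
abilities = [-1.0, 0.0, 1.2]
easiness = [-0.7, 0.1, 0.5]
rasch = [[1.0 / (1.0 + exp(-(a + e))) for e in easiness] for a in abilities]

# Hypothetical matrix that is monotone but violates double cancellation.
bad = [
    [0.1, 0.2, 0.6],
    [0.3, 0.4, 0.7],
    [0.5, 0.8, 0.9],
]

print(is_monotone(rasch), double_cancellation_violations(rasch))
print(is_monotone(bad), double_cancellation_violations(bad))
```

With observed proportions in place of true probabilities, sampling error alone can produce apparent violations, which is the difficulty that motivates estimating the probabilities under order restrictions instead of checking the raw table.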
