Optimizing information using the EM algorithm in item response theory

Latent trait models such as item response theory (IRT) hypothesize a functional relationship between an unobservable, or latent, variable and an observable outcome variable. In educational measurement, a discrete item response is usually the observable outcome variable, and the latent variable is associated with an examinee’s trait level (e.g., skill, proficiency). The link between the two variables is called an item response function. This function, defined by a set of item parameters, models the probability of observing a given item response, conditional on a specific trait level.

Typically in a measurement setting, neither the item parameters nor the trait levels are known, and so must be estimated from the pattern of observed item responses. Although a maximum likelihood approach can be taken in estimating these parameters, it usually cannot be employed directly. Instead, a method of marginal maximum likelihood (MML) is utilized, via the expectation-maximization (EM) algorithm. Alternating between an expectation (E) step and a maximization (M) step, the EM algorithm assures that the marginal log likelihood function will not decrease after each EM cycle, and will converge to a local maximum.

Interestingly, the negative of this marginal log likelihood function is equal to the relative entropy, or Kullback-Leibler divergence, between the conditional distribution of the latent variables given the observable variables and the joint likelihood of the latent and observable variables. With an unconstrained optimization for the M-step proposed here, the EM algorithm as minimization of Kullback-Leibler divergence admits the convergence results due to Csiszár and Tusnády (Statistics & Decisions, 1:205–237, 1984), a consequence of the binomial likelihood common to latent trait models with dichotomous response variables. For this unconstrained optimization, the EM algorithm converges to a global maximum of the marginal log likelihood function, yielding an information bound that permits a fixed point of reference against which models may be tested. A likelihood ratio test between marginal log likelihood functions obtained through constrained and unconstrained M-steps is provided as a means for testing models against this bound. Empirical examples demonstrate the approach.
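The E-step and M-step alternation described in the abstract can be illustrated with a small sketch. It is not the paper's implementation. It assumes that "unconstrained" means each item's response probability at each discrete latent class is a free parameter, so the M-step has a closed form (expected correct responses divided by expected class counts). The function names `simulate_responses` and `em_unconstrained`, the number of latent classes, the choice to re-estimate the class weights, and the simulated 2PL data are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_responses(n_persons, n_items, rng):
    """Simulate dichotomous responses from a 2PL model (illustrative data only)."""
    theta = rng.normal(size=n_persons)            # latent trait levels
    a = rng.uniform(0.8, 2.0, size=n_items)       # hypothetical discriminations
    b = rng.normal(size=n_items)                  # hypothetical difficulties
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))
    return (rng.random((n_persons, n_items)) < p).astype(float)


def em_unconstrained(X, n_classes=5, n_cycles=50, seed=0):
    """EM with free (unconstrained) response probabilities per latent class.

    Assumption: the latent variable is discretized into n_classes points and
    both the class weights and the item probabilities are re-estimated.
    Returns the probabilities, the weights, and the marginal log likelihood
    evaluated at the start of each cycle.
    """
    rng = np.random.default_rng(seed)
    n_persons, n_items = X.shape
    weights = np.full(n_classes, 1.0 / n_classes)
    probs = rng.uniform(0.2, 0.8, size=(n_classes, n_items))
    history = []
    for _ in range(n_cycles):
        # E-step: posterior probability of each latent class per examinee.
        log_lik = X @ np.log(probs).T + (1.0 - X) @ np.log(1.0 - probs).T
        log_joint = log_lik + np.log(weights)
        log_marginal = np.logaddexp.reduce(log_joint, axis=1)
        history.append(log_marginal.sum())
        post = np.exp(log_joint - log_marginal[:, None])

        # M-step: closed-form binomial updates, no item response function imposed.
        n_k = np.maximum(post.sum(axis=0), 1e-12)  # expected examinees per class
        r_kj = post.T @ X                          # expected correct responses
        probs = np.clip(r_kj / n_k[:, None], 1e-6, 1.0 - 1e-6)
        weights = n_k / n_k.sum()
    return probs, weights, np.array(history)


if __name__ == "__main__":
    data = simulate_responses(500, 10, np.random.default_rng(1))
    _, _, ll = em_unconstrained(data)
    print("first cycle:", ll[0], "last cycle:", ll[-1])
```

The recorded marginal log likelihood should not decrease from one cycle to the next, up to floating-point tolerance and the small clipping applied to keep the logarithms finite. A constrained fit (for example, a parametric item response function) could be compared against the final value with a likelihood ratio statistic, which this sketch does not implement.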
