Statistical Analysis of Item Preknowledge in Educational Tests: Latent Variable Modelling and Statistical Decision Theory

Tests are a building block of the modern education system. Many tests, such as admission, licensing, and certification tests, are high-stakes and can significantly change a test taker's life trajectory. For this reason, ensuring fairness in educational tests is an increasingly important problem. This paper concerns item preknowledge in educational tests caused by item leakage: a proportion of test takers have access to leaked items before the test is administered, which inflates their performance on those items. We develop methods for simultaneously detecting cheating test takers and compromised items from a single test administration when both sets are completely unknown. Latent variable models are proposed for (1) data consisting only of item-level binary scores and (2) data consisting of both item-level binary scores and response times, where the former is commonly available in paper-and-pencil tests and the latter is widely encountered in computer-based tests. The proposed model adds a latent class component to a factor model (also known as an item response theory model): the factor component captures item response behavior driven by test takers' ability, and the latent class component captures item response behavior due to item preknowledge. We further propose a statistical decision framework under which compound decision rules are developed that control local false discovery and nondiscovery rates. Statistical inference is carried out under a Bayesian framework. The proposed method is applied to data from a computer-based nonadaptive licensure assessment.
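
To make the idea of a compound decision rule concrete, the following is a minimal sketch and not the paper's own procedure. It assumes that a Bayesian fit has already produced, for each test taker (or item), a posterior probability of belonging to the preknowledge class, and it flags the largest set whose average posterior probability of being a false discovery stays at or below a chosen level. The function name `select_by_local_fdr`, the threshold `alpha`, and the example probabilities are all hypothetical and introduced only for illustration.

```python
import numpy as np


def select_by_local_fdr(post_prob, alpha=0.05):
    """Flag the largest set whose average posterior null probability is <= alpha.

    post_prob: posterior probabilities of preknowledge, one per unit
               (assumed to come from an already fitted Bayesian model).
    alpha:     target level for the posterior expected proportion of
               false discoveries among the flagged units (assumption).
    """
    post_prob = np.asarray(post_prob, dtype=float)
    null_prob = 1.0 - post_prob  # posterior probability of NOT cheating

    # Rank units from most to least suspicious.
    order = np.argsort(null_prob, kind="stable")

    # Average posterior null probability among the top-k units, for every k.
    # Because null_prob is sorted ascending, this sequence is nondecreasing.
    running_mean = np.cumsum(null_prob[order]) / np.arange(1, len(order) + 1)

    # Largest k whose running mean does not exceed alpha.
    k = int(np.sum(running_mean <= alpha))

    flagged = np.zeros(len(post_prob), dtype=bool)
    flagged[order[:k]] = True
    return flagged


# Illustrative (made-up) posterior probabilities for six test takers.
post_prob = np.array([0.99, 0.97, 0.90, 0.60, 0.10, 0.02])
flagged = select_by_local_fdr(post_prob, alpha=0.05)
print(flagged)
```

The same function could be applied to item-level posterior probabilities to flag potentially compromised items. A rule that controls a local false nondiscovery rate would work analogously on the units left unflagged.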
