Model Criticism of Bayesian Networks with Latent Variables

The application of Bayesian networks (BNs) to cognitive assessment and intelligent tutoring systems poses new challenges for model construction. When a cognitive task analysis suggests building a BN with several latent variables, empirical criticism of the latent structure becomes both critical and complex. This paper introduces a methodology for criticizing models both globally (the BN as a whole) and locally (at individual observable nodes), and explores its value in identifying several kinds of misfit in the latent structure: node errors, edge errors, state errors, and prior probability errors. The results suggest that the proposed fit indices have potential for detecting model misfit and for helping to locate problematic components of the model.
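The abstract does not spell out the form of the global and local fit indices, but prediction-based scoring rules such as the Brier score are a standard choice for checking how well a fitted BN predicts its observable nodes. The sketch below is an illustrative assumption rather than the paper's implementation: it computes a Brier-type fit index for a single observable node from the network's predictive probabilities and the observed outcomes; the function name and toy data are hypothetical.

    import numpy as np

    # Brier-type fit index for one observable node: mean squared distance between
    # the network's predictive distribution and the one-hot coded observed state.
    def node_brier_score(pred_probs, observed):
        # pred_probs: (n_cases, n_states) predictive probabilities for the node
        # observed:   (n_cases,) integer index of the state actually observed
        n_cases, n_states = pred_probs.shape
        outcomes = np.zeros_like(pred_probs)
        outcomes[np.arange(n_cases), observed] = 1.0
        return float(np.mean(np.sum((pred_probs - outcomes) ** 2, axis=1)))

    # Toy illustration (hypothetical data): a 3-state observable node, 5 cases.
    rng = np.random.default_rng(0)
    probs = rng.dirichlet(np.ones(3), size=5)
    obs = rng.integers(0, 3, size=5)
    print("Brier-type node fit index:", node_brier_score(probs, obs))

Lower values indicate that the predictive distributions concentrate on the states actually observed; comparing such node-level scores across observables is one way a local index could point to problematic parts of the model.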
