"I Can Name that Bayesian Network in Two Matrixes!"

Abstract

In many situations, a Bayesian network can be split into a core network, consisting of a set of latent variables that describe the status of a system, and a set of fragments that relate the status variables to observable evidence that could be collected about the system state. This situation arises frequently in educational testing, where the status variables represent student proficiency and the evidence models (graph fragments linking competency variables to observable outcomes) correspond to the assessment tasks used to assess that proficiency. The traditional approach to knowledge engineering in this situation is to maintain a library of fragments, in which the graphical structure is specified with a graphical editor and the probabilities are then entered in a separate spreadsheet for each node. If many evidence model fragments employ the same design pattern, this requires a lot of repetitive data entry. Because the parameter values that determine the strength of the evidence can be buried on interior screens of an interface, it can be difficult for a design team to get an impression of the total evidence that a collection of evidence models provides about the system variables, and to identify holes in the data collection scheme. A Q-matrix (an incidence matrix whose rows represent observable outcomes from assessment tasks and whose columns represent competency variables) provides the graphical structure of the evidence models. The Q-matrix can be augmented with details of relationship strengths, giving a high-level overview of the kind of evidence available. The relationships among the status variables can be represented with an inverse covariance matrix; this is particularly useful in models from the social sciences, because the domain experts' knowledge about the system states often comes from factor analyses and similar procedures that naturally produce covariance matrixes. Representing the model with matrixes means that the bulk of the specification work can be done in a desktop spreadsheet program and does not require specialized software, which facilitates collaboration with external experts. The design idea is illustrated with examples from prior assessment design projects.
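To make the two-matrix idea concrete, the following is a minimal sketch, not the authors' implementation. All variable names, task names, and numbers are hypothetical and chosen only for illustration. It assumes a binary Q-matrix (1 means the competency is a parent of the observable) and uses the standard Gaussian graphical model convention that a zero off-diagonal entry in the inverse covariance matrix means no edge between two status variables.

```python
# Hypothetical toy example: names and numbers are illustrative assumptions,
# not values taken from the paper.
skills = ["Reading", "Writing", "Vocabulary"]
observables = ["Task1_Score", "Task2_Score", "Task3_Score"]

# Q-matrix: rows are observable outcomes, columns are competency variables.
# A 1 marks the competency as a parent of the observable in its evidence model.
Q = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
]

# Assumed inverse covariance matrix among the competency variables.
K = [
    [2.0, -0.8, 0.0],
    [-0.8, 2.0, -0.5],
    [0.0, -0.5, 1.5],
]


def evidence_parents(q_matrix, row_names, col_names):
    """Map each observable to the competencies flagged in its Q-matrix row."""
    return {
        row: [col for col, flag in zip(col_names, q_row) if flag]
        for row, q_row in zip(row_names, q_matrix)
    }


def coverage(q_matrix, col_names):
    """Count the observables that bear on each competency (column sums)."""
    return {
        col: sum(q_row[j] for q_row in q_matrix)
        for j, col in enumerate(col_names)
    }


def core_edges(precision, names, tol=1e-9):
    """List undirected edges implied by nonzero off-diagonal entries."""
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(precision[i][j]) > tol:
                edges.append((names[i], names[j]))
    return edges


parents = evidence_parents(Q, observables, skills)
counts = coverage(Q, skills)
edges = core_edges(K, skills)
```

The column sums in `counts` give the kind of overview the abstract describes: a competency with a low count is a candidate hole in the data collection scheme. Both matrices could equally be kept as spreadsheet tabs and read in from CSV files.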
