Interpretable Models for Understanding Immersive Simulations

This paper describes methods for the comparative evaluation of the interpretability of models of high-dimensional time series data inferred by unsupervised machine learning algorithms. The time series data used in this investigation were logs from an immersive simulation of the kind commonly used in education and healthcare training. The structures learnt by the models provide representations of participants' activities in the simulation that are intended to be meaningful to human observers. To choose the model that induces the best representation, we designed two interpretability tests, each of which evaluates the extent to which a model's output aligns with people's expectations or intuitions about what has occurred in the simulation. We compared the models' performance on these interpretability tests to their performance on statistical information criteria. We show that the models that maximize interpretability differ from those that optimize statistical information-theoretic criteria. Furthermore, we found that a model using a fully Bayesian approach performed well on both the statistical and the human-interpretability measures, making it a good candidate for fully automated model selection when direct empirical investigations of interpretability are costly or infeasible.
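To make the statistical side of this comparison concrete, the sketch below illustrates information-criterion-based model selection for time series models. It is a minimal illustration only, assuming Python with numpy and hmmlearn (neither is named in the paper), Gaussian hidden Markov models with diagonal covariances, and BIC as the criterion; the concrete model families and criteria used in the study may differ.

```python
# Minimal sketch: choose the number of hidden states of a Gaussian HMM by BIC.
# Assumptions (not from the paper): hmmlearn's GaussianHMM, diagonal covariances,
# candidate state counts 2..10, and BIC as the information criterion.
import numpy as np
from hmmlearn import hmm

def bic_score(model, X):
    """BIC for a fitted diagonal-covariance GaussianHMM on data X (n_samples, n_features)."""
    n, d = X.shape
    k = model.n_components
    # Free parameters: means (k*d), diagonal variances (k*d),
    # transition matrix rows (k*(k-1)), initial distribution (k-1).
    n_params = 2 * k * d + k * (k - 1) + (k - 1)
    log_likelihood = model.score(X)  # total log-likelihood of X under the fitted model
    return n_params * np.log(n) - 2.0 * log_likelihood

def select_by_bic(X, candidate_states=range(2, 11), seed=0):
    """Fit one HMM per candidate state count and return the BIC-minimizing fit."""
    best = None
    for k in candidate_states:
        model = hmm.GaussianHMM(n_components=k, covariance_type="diag",
                                n_iter=100, random_state=seed)
        model.fit(X)
        score = bic_score(model, X)
        if best is None or score < best[0]:
            best = (score, k, model)
    return best  # (bic, n_states, fitted_model)
```

A criterion of this kind ranks models purely by fit and complexity; the paper's point is that the model it selects need not be the one whose inferred structure people find easiest to interpret, which is why the interpretability tests are evaluated separately.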
