Feature Importance Explanations for Temporal Black-Box Models

Supervised learning models can capture rich, complex representations of their input features that are hard for humans to interpret. Existing methods for explaining such models are often specific to particular architectures and to data whose features have no time-varying component. In this work, we propose TIME, a method to explain models that are inherently temporal in nature. Our method (i) uses model-agnostic permutations to analyze global feature importance, (ii) identifies the importance of salient features with respect to their temporal ordering as well as localized windows of influence, and (iii) uses hypothesis testing to provide statistical rigor.
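
A minimal sketch of the permutation idea behind points (i) and (ii), under assumptions not stated in the abstract: the black-box model exposes a predict method over inputs of shape (samples, timesteps, features), and a feature's importance within a time window is measured as the increase in loss after permuting that feature across samples inside the window only. The function name, arguments, and loss handling below are hypothetical illustrations, not the paper's actual TIME procedure.

```python
import numpy as np

def window_permutation_importance(model, X, y, loss_fn, window, n_repeats=10, seed=None):
    """Hypothetical sketch of window-restricted permutation importance.

    X is assumed to have shape (n_samples, n_timesteps, n_features);
    `model.predict` and `loss_fn` stand in for the user's own model and metric.
    """
    rng = np.random.default_rng(seed)
    start, end = window
    n_samples, _, n_features = X.shape

    # Loss of the unperturbed data serves as the reference point.
    baseline = loss_fn(y, model.predict(X))
    importances = np.zeros((n_features, n_repeats))

    for j in range(n_features):
        for r in range(n_repeats):
            X_perm = X.copy()
            # Shuffle feature j across samples, but only inside the time window,
            # breaking its association with the labels while leaving all other
            # features and timesteps intact.
            perm = rng.permutation(n_samples)
            X_perm[:, start:end, j] = X[perm, start:end, j]
            importances[j, r] = loss_fn(y, model.predict(X_perm)) - baseline

    # Average increase in loss over repeats; the per-repeat values could also
    # serve as the statistic in a permutation test, in the spirit of item (iii).
    return importances.mean(axis=1)
```

Comparing, say, the scores returned for window (0, 24) against (24, 48) would then localize when in the sequence a feature matters, which is the kind of temporally localized importance the abstract describes.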
