Regional Tree Regularization for Interpretability in Deep Neural Networks

The lack of interpretability remains a barrier to adopting deep neural networks in many safety-critical domains. Tree regularization was recently proposed to encourage a deep neural network's decisions to resemble those of a globally compact, axis-aligned decision tree. However, it is often unreasonable to expect a single tree to predict well across all possible inputs, and in practice enforcing this can drive training toward optima that are neither interpretable nor performant. To address this issue, we propose regional tree regularization: a method that encourages a deep model to be well-approximated by several separate decision trees, each specific to a predefined region of the input space. Across many datasets, including two healthcare applications, we show our approach delivers simpler explanations than other regularization schemes without compromising accuracy. In particular, our regional regularizer finds many more “desirable” optima than its global analogue.
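
To make the idea concrete, below is a minimal sketch of the regional penalty in Python with scikit-learn: fit one shallow decision tree per predefined region to the network's predictions, and average the trees' decision-path lengths. The names `predict_fn`, `region_ids`, and the helper functions are illustrative assumptions, not the paper's code; path length computed this way is also non-differentiable, so a training-time regularizer would additionally need a differentiable surrogate of this quantity, as in prior tree-regularization work.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def average_path_length(tree, X):
    """Mean number of decisions the fitted tree uses to label each row of X."""
    # decision_path returns a sparse indicator of the nodes each sample visits;
    # summing a row counts the nodes on that sample's root-to-leaf path.
    return tree.decision_path(X).sum(axis=1).mean()

def regional_tree_penalty(predict_fn, X, region_ids, max_depth=5):
    """Fit one shallow decision tree per region to mimic the network, and
    return the mean path length across regions (lower = more tree-like)."""
    y_hat = (predict_fn(X) > 0.5).astype(int)  # binarized network outputs
    penalties = []
    for r in np.unique(region_ids):
        mask = region_ids == r
        tree = DecisionTreeClassifier(max_depth=max_depth)
        tree.fit(X[mask], y_hat[mask])  # tree mimics the network in region r
        penalties.append(average_path_length(tree, X[mask]))
    return float(np.mean(penalties))
```

Averaging per-region path lengths is what distinguishes this from the global variant: each region only needs a tree that is simple locally, rather than one tree that must stay compact over the entire input space.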
