Hybrid Decision Making: When Interpretable Models Collaborate With Black-Box Models

Interpretable machine learning models have received increasing interest in recent years, especially in domains where humans are involved in the decision-making process. However, gaining interpretability often comes at a cost in task performance. This performance downgrade puts practitioners in a dilemma: choose a top-performing black-box model that offers no explanations, or an interpretable model with unsatisfying task performance. In this work, we propose a novel framework for building a Hybrid Decision Model that integrates an interpretable model with any black-box model, introducing explanations into the decision-making process while preserving or possibly improving predictive accuracy. We propose a novel metric, explainability, which measures the percentage of data instances that are routed to the interpretable model for a decision. We also design a principled objective function that considers predictive accuracy, model interpretability, and data explainability. Under this framework, we develop the Collaborative Black-box and RUle Set Hybrid (CoBRUSH) model, which combines logic rules and any black-box model into a joint decision model. An input instance is first sent to the rules. If a rule is satisfied, a decision is generated directly; otherwise, the black-box model is activated to decide on the instance. To train a hybrid model, we design an efficient search algorithm that exploits theoretically grounded strategies to reduce computation. Experiments show that CoBRUSH models achieve the same or better accuracy than their black-box collaborators working alone while gaining explainability. They also have lower model complexity than interpretable baselines.
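The routing behavior described above is simple enough to sketch in code. The following is a minimal, illustrative Python sketch, not the paper's implementation: the class name HybridDecisionModel, its methods, and the rule representation are all hypothetical. It assumes rules are boolean predicates over a feature vector paired with a class label, and that the black-box collaborator is any fitted classifier exposing a scikit-learn-style predict method.

```python
# Illustrative sketch of the hybrid routing described in the abstract.
# All names here are hypothetical, not from the paper.
import numpy as np

class HybridDecisionModel:
    def __init__(self, rules, black_box):
        # rules: list of (condition, label) pairs, where condition is a
        # function mapping a single feature vector to True/False
        # black_box: any fitted classifier with a .predict() method
        self.rules = rules
        self.black_box = black_box

    def predict(self, X):
        X = np.asarray(X)
        preds = np.empty(len(X), dtype=int)
        covered = np.zeros(len(X), dtype=bool)
        for i, x in enumerate(X):
            for condition, label in self.rules:
                if condition(x):  # first satisfied rule decides directly
                    preds[i] = label
                    covered[i] = True
                    break
            if not covered[i]:  # no rule fired: fall back to the black box
                preds[i] = self.black_box.predict(x.reshape(1, -1))[0]
        return preds, covered

    def explainability(self, X):
        # the proposed metric: fraction of instances decided by the rules
        _, covered = self.predict(X)
        return covered.mean()
```

Under these assumptions, any off-the-shelf model (e.g., a random forest or an XGBoost classifier) could serve as the black-box collaborator, and explainability is then simply the rule-covered fraction of the data, trading off against accuracy and rule-set complexity in the objective.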
