Interpretable Companions for Black-Box Models

We present an interpretable companion model for any pre-trained black-box classifier. The idea is that for any input, a user can either receive a prediction from the black-box model, with high accuracy but no explanation, or apply a companion rule to obtain an interpretable prediction with slightly lower accuracy. The companion model is trained from data and the predictions of the black-box model, with an objective that combines the area under the transparency--accuracy curve with model complexity. Our model offers a flexible alternative for practitioners who would otherwise have to choose between always using an interpretable model and always using a black-box model for a predictive task: for any given input, a user can fall back on an interpretable prediction if its accuracy is satisfactory, or stick with the black-box model if the rules are not. To demonstrate the value of companion models, we design a human evaluation with more than a hundred participants to investigate how much accuracy people are willing to sacrifice in exchange for interpretability.
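To make the transparency--accuracy trade-off concrete, here is a minimal sketch of how a companion rule list can delegate predictions and how the curve could be computed. The function names (companion_predict, transparency_accuracy_curve), the representation of rules as (condition, label) pairs, and the use of rule-list prefixes as transparency levels are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def companion_predict(x, rule_list, black_box, transparency_level):
    """Predict with the first matching rule among the top-`transparency_level`
    rules; otherwise defer to the black-box model (assumed representation)."""
    for condition, label in rule_list[:transparency_level]:
        if condition(x):
            return label, True            # interpretable prediction
    return black_box(x), False            # black-box prediction

def transparency_accuracy_curve(X, y, rule_list, black_box):
    """Accuracy and fraction of transparent (rule-covered) predictions
    at each rule-list prefix length."""
    curve = []
    for level in range(len(rule_list) + 1):
        results = [companion_predict(x, rule_list, black_box, level) for x in X]
        preds = np.array([p for p, _ in results])
        transparent = np.array([t for _, t in results])
        curve.append((transparent.mean(), (preds == y).mean()))
    return curve  # list of (transparency, accuracy) pairs

def area_under_curve(curve):
    """Trapezoidal area under the transparency--accuracy curve."""
    xs, ys = zip(*sorted(curve))
    return np.trapz(ys, xs)
```

Under these assumptions, a training procedure would search over candidate rule lists to maximize area_under_curve minus a complexity penalty, so that users at any chosen transparency level lose as little accuracy as possible.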
