Transparency Promotion with Model-Agnostic Linear Competitors

We propose a new type of hybrid model for multi-class classification in which competing linear models collaborate with an existing black-box model to promote transparency in the decision-making process. The proposed model, Model-Agnostic Linear Competitors (MALC), combines the interpretability of linear models with the predictive performance of state-of-the-art black-box models. We formulate the training of a MALC model as a convex optimization problem whose objective function accounts for both predictive accuracy and transparency, defined as the percentage of data captured by the linear models. Experiments show that MALC gives users more flexibility to balance transparency and accuracy than the currently available choice between a pure black-box model and a pure interpretable model. A human evaluation also shows that more users are likely to choose MALC, because of this flexibility, than either interpretable models or black-box models.
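To make the transparency measure concrete, the following is a minimal sketch of how a hybrid of linear competitors and a black box might route instances and how the captured fraction could be computed. The routing rule (a class's linear score must beat every other class's score by a margin), the names `hybrid_predict`, `transparency`, `margin`, and the weight arrays `W` and `b` are all assumptions made for illustration; the abstract does not specify MALC's decision rule or training procedure, and no training is shown here.

```python
import numpy as np

def hybrid_predict(X, W, b, blackbox_predict, margin=1.0):
    """Route each row of X to a linear competitor or to the black box.

    Assumed rule (illustrative, not taken from the source): the linear
    model of the top-scoring class claims an instance when its score
    exceeds the runner-up score by at least `margin`.
    """
    scores = X @ W.T + b                    # shape (n_samples, n_classes)
    ordered = np.sort(scores, axis=1)       # ascending per row
    gap = ordered[:, -1] - ordered[:, -2]   # best score minus runner-up
    captured = gap >= margin                # True where a linear model decides
    # The black box is queried on every row here for simplicity; only the
    # uncaptured rows actually use its output.
    y_pred = np.where(captured, scores.argmax(axis=1), blackbox_predict(X))
    return y_pred, captured

def transparency(captured):
    """Fraction of instances decided by the linear models."""
    return float(np.mean(captured))
```

Under this assumed rule, raising `margin` sends more instances to the black box and lowers transparency, while lowering it lets the linear models decide more cases, which is one way to picture the transparency-accuracy trade-off the abstract describes.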
