Understanding Global Feature Contributions Through Additive Importance Measures

Understanding the inner workings of complex machine learning models is a long-standing problem, with recent research focusing primarily on local interpretability. To assess the role of individual input features in a global sense, we propose a new feature importance method, Shapley Additive Global importancE (SAGE), a model-agnostic measure of feature importance based on the predictive power associated with each feature. SAGE relates to prior work through the novel framework of additive importance measures, a perspective that unifies numerous other feature importance methods and shows that only SAGE properly accounts for complex feature interactions. We define SAGE using the Shapley value from cooperative game theory, which leads to numerous intuitive and desirable properties. Our experiments apply SAGE to eight datasets, including MNIST and breast cancer subtype classification, and demonstrate its advantages through quantitative and qualitative evaluations.
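To make the role of the Shapley value concrete, the following is a minimal sketch rather than the authors' estimation procedure. It computes exact Shapley values for a small cooperative game in which the "players" are features and the value of a subset stands for the predictive power obtained from those features. The function name `shapley_values`, the toy value function `predictive_power`, and the feature names are all hypothetical, chosen only for illustration. In the toy game, `x1` and `x2` are useful only together (a pure interaction), while `x3` contributes on its own.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    `players` is a list of hashable player names; `value` maps a
    frozenset of players to a real number. Cost is exponential in
    len(players), so this is only suitable for tiny examples.
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes player i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def predictive_power(subset):
    """Hypothetical value function (an assumption for illustration).

    x3 adds 0.2 on its own; x1 and x2 add 1.0 only when both are present.
    """
    v = 0.0
    if "x3" in subset:
        v += 0.2
    if "x1" in subset and "x2" in subset:
        v += 1.0
    return v


features = ["x1", "x2", "x3"]
phi = shapley_values(features, predictive_power)
for name in features:
    print(name, round(phi[name], 6))
```

In this toy game the interaction credit is split evenly between `x1` and `x2`, and the attributions sum to the value of the full feature set minus the value of the empty set. That additivity is the kind of property the abstract refers to as intuitive and desirable.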
