Towards Aggregating Weighted Feature Attributions

Current approaches for explaining machine learning models fall into two distinct classes: antecedent event influence and value attribution. The former leverages training instances to describe how much influence a training point exerts on a test point, while the latter attempts to attribute value to the features most pertinent to a given prediction. In this work, we discuss an algorithm, AVA: Aggregate Valuation of Antecedents, that fuses these two explanation classes to form a new approach to feature attribution that not only retrieves local explanations but also captures global patterns learned by a model. Our experimentation convincingly favors weighting and aggregating feature attributions via AVA.

José M. F. Moura | Pradeep Ravikumar | Umang Bhatt
