Streaming Weak Submodularity: Interpreting Neural Networks on the Fly

In many machine learning applications, it is important to explain the predictions of a black-box classifier. For example, why does a deep neural network assign an image to a particular class? We cast interpretability of black-box classifiers as a combinatorial maximization problem and propose an efficient streaming algorithm to solve it subject to cardinality constraints. By extending ideas from Badanidiyuru et al. [2014], we provide a constant-factor approximation guarantee for our algorithm when the stream arrives in random order and the objective function is weakly submodular. This is the first such theoretical guarantee for this general class of functions, and we also show that no such algorithm exists for a worst-case stream order. Our algorithm obtains similar explanations of Inception V3 predictions 10 times faster than the state-of-the-art LIME framework of Ribeiro et al. [2016].
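To make the streaming setting concrete, the following is a minimal sketch of a single-pass, threshold-based selection rule under a cardinality constraint, loosely in the spirit of the thresholding idea of Badanidiyuru et al. [2014]. It is not the algorithm analyzed in this paper. The function name `threshold_stream_select`, the fixed threshold `tau`, and the toy coverage objective are all assumptions made for illustration. Coverage is fully submodular, which is a special case of weak submodularity.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List


def threshold_stream_select(
    stream: Iterable[Hashable],
    f: Callable[[FrozenSet[Hashable]], float],
    k: int,
    tau: float,
) -> List[Hashable]:
    """Single pass over the stream: keep an item if its marginal gain
    with respect to the current selection is at least tau, and stop
    once k items are kept. Illustrative only (hypothetical helper)."""
    selected: List[Hashable] = []
    current_value = f(frozenset())
    for item in stream:
        if len(selected) >= k:
            break  # cardinality constraint reached
        candidate_value = f(frozenset(selected) | {item})
        if candidate_value - current_value >= tau:
            selected.append(item)
            current_value = candidate_value
    return selected


# Toy objective (an assumption for this example): number of covered elements.
COVER = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {5},
    "d": {1, 2, 3, 4},
}


def coverage(subset: FrozenSet[Hashable]) -> float:
    covered = set()
    for name in subset:
        covered |= COVER[name]
    return float(len(covered))


if __name__ == "__main__":
    print(threshold_stream_select(["a", "b", "c", "d"], coverage, k=2, tau=2.0))
    print(threshold_stream_select(["a", "b", "c", "d"], coverage, k=2, tau=1.0))
```

Each arriving item costs one evaluation of the objective and the selection is never revisited, which is what makes the rule a streaming one. In practice the threshold is unknown in advance; choosing it well, and handling objectives that are only weakly submodular, is where the analysis in the paper comes in.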

[1] Laurence A. Wolsey, et al. Best Algorithms for Approximating the Maximum of a Submodular Set Function, 1978, Math. Oper. Res.

[2] M. L. Fisher, et al. An analysis of approximations for maximizing submodular set functions—I, 1978, Math. Program.

[3] Gérard Cornuéjols, et al. Submodular set functions, matroids and the greedy algorithm: Tight worst-case bounds and some generalizations of the Rado-Edmonds theorem, 1984, Discret. Appl. Math.

[4] U. Feige. A threshold of ln n for approximating set cover, 1998, JACM.

[5] Rong Jin, et al. Batch mode active learning and its application to medical image classification, 2006, ICML.

[6] J. Vondrák. Submodularity and curvature: the optimal algorithm, 2008.

[7] Andreas Krause, et al. Submodular Dictionary Selection for Sparse Representation, 2010, ICML.

[8] Bhiksha Raj, et al. Greedy sparsity-constrained optimization, 2011, 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR).

[9] Abhimanyu Das, et al. Submodular meets Spectral: Greedy Algorithms for Subset Selection, Sparse Approximation and Dictionary Selection, 2011, ICML.

[10] Jan Vondrák, et al. Maximizing a Monotone Submodular Function Subject to a Matroid Constraint, 2011, SIAM J. Comput.

[11] Pascal Fua, et al. SLIC Superpixels Compared to State-of-the-Art Superpixel Methods, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12] Andreas Krause, et al. Distributed Submodular Maximization: Identifying Representative Elements in Massive Data, 2013, NIPS.

[13] Pushmeet Kohli, et al. Tractability: Practical Approaches to Hard Problems, 2013.

[14] Francis R. Bach, et al. Learning with Submodular Functions: A Convex Optimization Perspective, 2011, Found. Trends Mach. Learn.

[15] Joseph K. Bradley, et al. Parallel Double Greedy Submodular Maximization, 2014, NIPS.

[16] Andreas Krause, et al. Streaming submodular maximization: massive data summarization on the fly, 2014, KDD.

[17] Andreas Krause, et al. Submodular Function Maximization, 2014, Tractability.

[18] Yonina C. Eldar, et al. Sparse Nonlinear Regression: Parameter Estimation and Asymptotic Inference, 2015, ArXiv.

[19] Roy Schwartz, et al. Online Submodular Maximization with Preemption, 2015, SODA.

[20] Huy L. Nguyen, et al. The Power of Randomization: Distributed Submodular Maximization on Massive Datasets, 2015, ICML.

[21] Kent Quanrud, et al. Streaming Algorithms for Submodular Function Maximization, 2015, ICALP.

[22] Rishabh K. Iyer, et al. Submodularity in Data Subset Selection and Active Learning, 2015, ICML.

[23] Jan Vondrák, et al. Optimal approximation for submodular and supermodular optimization with bounded curvature, 2013, SODA.

[24] Andreas Krause, et al. Lazier Than Lazy Greedy, 2014, AAAI.

[25] Huy L. Nguyen, et al. A New Framework for Distributed Submodular Maximization, 2015, 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS).

[26] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27] Carlos Guestrin, et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier, 2016, ArXiv.

[28] Yaron Singer, et al. Maximization of Approximately Submodular Functions, 2016, NIPS.

[29] Niv Buchbinder, et al. Deterministic Algorithms for Submodular Maximization Problems, 2016, SODA.

[30] Aditya Bhaskara, et al. Greedy Column Subset Selection: New Bounds and Distributed Algorithms, 2016, ICML.

[31] Avinatan Hassidim, et al. Submodular Optimization under Noise, 2017, COLT.

[32] Alexandros G. Dimakis, et al. On Approximation Guarantees for Greedy Low Rank Optimization, 2017, ICML.

[33] T.-H. Hubert Chan, et al. Online Submodular Maximization with Free Disposal: Randomization Beats ¼ for Partition Matroids, 2017, SODA.

[34] Andreas Krause, et al. Guaranteed Non-convex Optimization: Submodular Maximization over Continuous Domains, 2016, AISTATS.

[35] Ankur Taly, et al. Axiomatic Attribution for Deep Networks, 2017, ICML.

[36] Alexandros G. Dimakis, et al. Restricted Strong Convexity Implies Weak Submodularity, 2016, The Annals of Statistics.

[37] Niv Buchbinder, et al. Constrained Submodular Maximization via a Non-symmetric Technique, 2016, Math. Oper. Res.