Don't Paint It Black: White-Box Explanations for Deep Learning in Computer Security

Deep learning is increasingly used as a basic building block of security systems. Unfortunately, deep neural networks are hard to interpret, and their decision process is opaque to the practitioner. Recent work has started to address this problem by considering black-box explanations for deep learning in computer security (CCS'18). The underlying explanation methods, however, ignore the structure of neural networks and thus omit crucial information for analyzing the decision process. In this paper, we investigate white-box explanations and systematically compare them with current black-box approaches. In an extensive evaluation with learning-based systems for malware detection and vulnerability discovery, we demonstrate that white-box explanations are more concise, sparse, complete and efficient than black-box approaches. As a consequence, we generally recommend the use of white-box explanations if access to the employed neural network is available, which usually is the case for stand-alone systems for malware detection, binary analysis, and vulnerability discovery.

[1]  Shigeo Abe DrEng Pattern Classification , 2001, Springer London.

[2]  Adam Doupé,et al.  Deep Android Malware Detection , 2017, CODASPY.

[3]  Stephen P. Boyd,et al.  CVXPY: A Python-Embedded Modeling Language for Convex Optimization , 2016, J. Mach. Learn. Res..

[4]  Klaus-Robert Müller,et al.  "What is relevant in a text document?": An interpretable machine learning approach , 2016, PloS one.

[5]  Carlos Guestrin,et al.  "Why Should I Trust You?": Explaining the Predictions of Any Classifier , 2016, ArXiv.

[6]  Dan Arp,et al.  Drebin : � Efficient and Explainable Detection of Android Malware in Your Pocket , 2014 .

[7]  Geoffrey E. Hinton,et al.  Learning internal representations by error propagation , 1986 .

[8]  Yarin Gal,et al.  Real Time Image Saliency for Black Box Classifiers , 2017, NIPS.

[9]  Yajin Zhou,et al.  Dissecting Android Malware: Characterization and Evolution , 2012, 2012 IEEE Symposium on Security and Privacy.

[10]  Pavel Laskov,et al.  Practical Evasion of a Learning-Based Classifier: A Case Study , 2014, 2014 IEEE Symposium on Security and Privacy.

[11]  David A. Wagner,et al.  Towards Evaluating the Robustness of Neural Networks , 2016, 2017 IEEE Symposium on Security and Privacy (SP).

[12]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[13]  Shouhuai Xu,et al.  VulDeePecker: A Deep Learning-Based System for Vulnerability Detection , 2018, NDSS.

[15]  Cengiz Öztireli,et al.  Towards better understanding of gradient-based attribution methods for Deep Neural Networks , 2017, ICLR.

[16]  Zhenkai Liang,et al.  Neural Nets Can Learn Function Type Signatures From Binaries , 2017, USENIX Security Symposium.

[17]  Avanti Shrikumar,et al.  Learning Important Features Through Propagating Activation Differences , 2017, ICML.

[18]  Le Song,et al.  Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection , 2018 .

[19]  Klaus-Robert Müller,et al.  Learning how to explain neural networks: PatternNet and PatternAttribution , 2017, ICLR.

[20]  Patrick D. McDaniel,et al.  Adversarial Examples for Malware Detection , 2017, ESORICS.

[21]  Andrea Vedaldi,et al.  Interpretable Explanations of Black Boxes by Meaningful Perturbation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[22]  Christopher Krügel,et al.  Revolver: An Automated Approach to the Detection of Evasive Web-based Malware , 2013, USENIX Security Symposium.

[23]  Wenyi Huang,et al.  MtNet: A Multi-Task Neural Network for Dynamic Malware Classification , 2016, DIMVA.

[24]  Angelos Stavrou,et al.  Malicious PDF detection using metadata and structural features , 2012, ACSAC '12.

[25]  Thomas Brox,et al.  Striving for Simplicity: The All Convolutional Net , 2014, ICLR.

[26]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[27]  G. G. Stokes "J." , 1890, The New Yale Book of Quotations.

[28]  Gang Wang,et al.  LEMNA: Explaining Deep Learning based Security Applications , 2018, CCS.

[29]  Yoshua Bengio,et al.  Convolutional networks for images, speech, and time series , 1998 .

[30]  Alexander Binder,et al.  On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation , 2015, PloS one.

[31]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[32]  Andrew Zisserman,et al.  Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps , 2013, ICLR.

[33]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[34]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[35]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[36]  Michael P. Wellman,et al.  SoK: Security and Privacy in Machine Learning , 2018, 2018 IEEE European Symposium on Security and Privacy (EuroS&P).

[37]  Bolei Zhou,et al.  Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Dawn Xiaodong Song,et al.  Recognizing Functions in Binaries with Neural Networks , 2015, USENIX Security Symposium.

[39]  Nicholas Carlini,et al.  Is AmI (Attacks Meet Interpretability) Robust to Adversarial Examples? , 2019, ArXiv.

[40]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[41]  Ankur Taly,et al.  Axiomatic Attribution for Deep Networks , 2017, ICML.