Improving Compositionality of Neural Networks by Decoding Representations to Inputs

In traditional software programs, it is easy to trace program logic from variables back to inputs, apply assertion statements to block erroneous behavior, and compose programs together. Although deep learning programs have demonstrated strong performance on novel applications, they sacrifice many of the functionalities of traditional software programs. With this as motivation, we take a modest first step towards improving deep learning programs by jointly training a generative model to constrain neural network activations to “decode” back to inputs. We call this design a Decodable Neural Network, or DecNN. Doing so enables a form of compositionality in neural networks, where one can recursively compose a DecNN with itself to create an ensemble-like model with uncertainty estimates. In our experiments, we demonstrate applications of this uncertainty to out-of-distribution detection, adversarial example detection, and calibration, while matching standard neural networks in accuracy. We further explore this compositionality by combining DecNN with pretrained models, where we show promising results that neural networks can be regularized to avoid using protected features.
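To make the core idea concrete, the following is a minimal sketch of a decodable network: a classifier whose hidden activations are jointly trained to map back to the input. This is an illustrative assumption of the setup, not the paper's exact method; in particular, the full approach uses a generative model as the decoder, whereas this sketch substitutes a plain reconstruction decoder with an MSE term, and all layer sizes, names, and the weighting `beta` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecNN(nn.Module):
    """Sketch of a Decodable Neural Network: classifier + decoder over shared activations."""

    def __init__(self, in_dim=784, hidden_dim=256, num_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.classifier = nn.Linear(hidden_dim, num_classes)
        # Stand-in for the jointly trained generative model: a decoder that
        # constrains hidden activations to "decode" back to the input.
        self.decoder = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, in_dim)
        )

    def forward(self, x):
        h = self.encoder(x)          # shared representation
        logits = self.classifier(h)  # task prediction
        x_hat = self.decoder(h)      # reconstruction of the input from activations
        return logits, x_hat

def decnn_loss(logits, x_hat, x, y, beta=1.0):
    # Joint objective: classification loss plus a reconstruction term
    # (beta trades off the two; its value here is an assumption).
    return F.cross_entropy(logits, y) + beta * F.mse_loss(x_hat, x)

# Hypothetical usage on a batch (x, y):
#   logits, x_hat = model(x)
#   loss = decnn_loss(logits, x_hat, x, y)
#   loss.backward()
```

Because the reconstruction x_hat lives in input space, it can be fed back into the same (or another) model, which is one way to read the recursive composition and ensemble-like uncertainty described above.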
