Local Competition and Stochasticity for Adversarial Robustness in Deep Learning

This work addresses adversarial robustness in deep learning by considering deep networks with stochastic local winner-takes-all (LWTA) nonlinearities. Units of this type yield sparse representations at each model layer, since they are organized in blocks within which only one unit generates a nonzero output. The main operating principle of the introduced units lies in stochastic arguments: the network performs posterior sampling over the competing units of each block to select the winner. We combine these LWTA arguments with tools from Bayesian nonparametrics, specifically the stick-breaking construction of the Indian Buffet Process, so as to infer the sub-part of each layer that is essential for modeling the data at hand. Inference for the proposed network is performed by means of stochastic variational Bayes. We perform a thorough experimental evaluation of our model on benchmark datasets under gradient-based adversarial attacks. As we show, our method achieves high robustness to adversarial perturbations, with state-of-the-art performance under powerful white-box attacks.
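To make the competition mechanism concrete, below is a minimal NumPy sketch of a stochastic LWTA layer: units are grouped into blocks, a single winner per block is sampled from a softmax over the block's pre-activations (via the Gumbel-max trick), and only the winner propagates a nonzero output. All names here (stochastic_lwta, block_size, temperature) are illustrative, not taken from the paper, and the temperature-scaled sampling stands in for the full model's posterior-sampling scheme.

```python
import numpy as np

def stochastic_lwta(x, W, block_size=2, temperature=0.5, rng=None):
    """Stochastic local winner-takes-all layer: a minimal sketch.

    x: (batch, d_in) inputs; W: (d_in, d_out) weights, with d_out a
    multiple of block_size. Within each block of competing units, one
    winner is sampled from a softmax over the pre-activations, and only
    the winner's output is kept, yielding a sparse representation.
    """
    rng = rng or np.random.default_rng()
    h = x @ W                                   # linear pre-activations
    batch, d_out = h.shape
    blocks = h.reshape(batch, d_out // block_size, block_size)

    # Gumbel-max trick: adding Gumbel(0, 1) noise to the logits and
    # taking the argmax draws a sample from the corresponding softmax
    # distribution (here at the given temperature).
    gumbel = -np.log(-np.log(rng.uniform(size=blocks.shape)))
    winners = np.argmax(blocks / temperature + gumbel, axis=-1)

    # One-hot mask: only the sampled winner in each block is nonzero.
    mask = np.eye(block_size)[winners]
    return (blocks * mask).reshape(batch, d_out)

# Example: 8 output units organized into 4 competing blocks of 2.
x = np.random.default_rng(0).normal(size=(3, 5))
W = np.random.default_rng(1).normal(size=(5, 8))
out = stochastic_lwta(x, W)
print(out)  # exactly one nonzero entry per block of 2, per example
```

Because the winner is sampled rather than chosen deterministically, repeated forward passes through the same layer can activate different units, which is the source of the stochasticity the abstract credits for robustness to gradient-based perturbations.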
