Perceptual Deep Neural Networks: Adversarial Robustness through Input Recreation

Adversarial examples have shown that, albeit highly accurate, models learned by machines have many weaknesses that humans do not share. Human perception, however, is also fundamentally different from machine perception: we do not see the raw signals that arrive at the retina but rather a complex recreation of them. In this paper, we explore how machines could likewise recreate their input and investigate the benefits of such an augmented perception. To this end, we propose Perceptual Deep Neural Networks ($\varphi$DNN), which recreate their own input before further processing. The concept is formalized mathematically, and two variations are developed: one based on inpainting the whole image and the other on a noisy, resized super-resolution recreation. Experiments reveal that $\varphi$DNNs can reduce the accuracy of attacks substantially, surpassing state-of-the-art defenses in 87% of the tests against adversarial training variations and in 100% of the tests when compared only with other pre-processing defenses. Moreover, the recreation process intentionally corrupts the input image. Interestingly, ablation tests show that corrupting the input, although counter-intuitive, is beneficial. This suggests that the blind spot in vertebrates might analogously be a precursor of visual robustness. Thus, $\varphi$DNNs reveal that input recreation has strong benefits for artificial neural networks similar to biological ones, shedding light on the importance of the blind spot and opening an area of perception models for robust recognition in artificial intelligence.
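As a minimal sketch of the recreate-then-classify idea described above (assuming a PyTorch-style implementation; the `PerceptualDNN` class, the `recreator` module, and the `noise_std` and `scale` parameters are illustrative assumptions, not the authors' code), the noisy resized recreation variant might be wrapped around a standard classifier as follows:

```python
# Hypothetical sketch of the phi-DNN concept: the input is deliberately
# corrupted (downscaled and noised) and then recreated by a separate network
# before the classifier ever sees it. All names and hyperparameters here are
# assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerceptualDNN(nn.Module):
    def __init__(self, recreator: nn.Module, classifier: nn.Module,
                 noise_std: float = 0.1, scale: float = 0.5):
        super().__init__()
        self.recreator = recreator      # e.g. an inpainting or super-resolution network
        self.classifier = classifier    # any standard image classifier
        self.noise_std = noise_std
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # 1) Corrupt: shrink the image and add Gaussian noise, discarding the
        #    fine-grained detail that adversarial perturbations rely on.
        small = F.interpolate(x, scale_factor=self.scale,
                              mode="bilinear", align_corners=False)
        small = small + self.noise_std * torch.randn_like(small)
        # 2) Recreate: upsample back to the original size and let the
        #    recreator synthesize a clean version of the input.
        upsampled = F.interpolate(small, size=x.shape[-2:],
                                  mode="bilinear", align_corners=False)
        recreated = self.recreator(upsampled)
        # 3) Classify the recreation instead of the raw (possibly attacked) input.
        return self.classifier(recreated)
```

The design choice mirrored here is that the classifier never operates on the raw pixels: robustness comes from forcing all inputs through a corrupting and recreating bottleneck, analogous to perceptual filling-in.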
