Invert and Defend: Model-based Approximate Inversion of Generative Adversarial Networks for Secure Inference

Inferring the latent variable generating a given test sample is a challenging problem in Generative Adversarial Networks (GANs). In this paper, we propose InvGAN - a novel framework for solving the inference problem in GANs, which involves training an encoder network capable of inverting a pre-trained generator network without access to any training data. Under mild assumptions, we theoretically show that using InvGAN, we can approximately invert the generations of any latent code of a trained GAN model. Furthermore, we empirically demonstrate the superiority of our inference scheme by quantitative and qualitative comparisons with other methods that perform a similar task. We also show the effectiveness of our framework in the problem of adversarial defenses where InvGAN can successfully be used as a projection-based defense mechanism. Additionally, we show how InvGAN can be used to implement reparameterization white-box attacks on projection-based defense mechanisms. Experimental validation on several benchmark datasets demonstrate the efficacy of our method in achieving improved performance on several white-box and black-box attacks. Our code is available at this https URL.

[1]  Roland Vollgraf,et al.  Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms , 2017, ArXiv.

[2]  Joan Bruna,et al.  Intriguing properties of neural networks , 2013, ICLR.

[3]  Alexandros G. Dimakis,et al.  Compressed Sensing using Generative Models , 2017, ICML.

[4]  Taesung Park,et al.  CyCADA: Cycle-Consistent Adversarial Domain Adaptation , 2017, ICML.

[5]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Yang Song,et al.  PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples , 2017, ICLR.

[7]  Ananthram Swami,et al.  Practical Black-Box Attacks against Machine Learning , 2016, AsiaCCS.

[8]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[9]  Dumitru Erhan,et al.  Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Martín Abadi,et al.  TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.

[11]  Hao Chen,et al.  MagNet: A Two-Pronged Defense against Adversarial Examples , 2017, CCS.

[12]  Fei Wang,et al.  Differentially Private Generative Adversarial Network , 2018, ArXiv.

[13]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[14]  Andrea Vedaldi,et al.  It Takes (Only) Two: Adversarial Generator-Encoder Networks , 2017, AAAI.

[15]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[16]  Subarna Tripathi,et al.  Precise Recovery of Latent Vectors from Generative Adversarial Networks , 2017, ICLR.

[17]  Aaron C. Courville,et al.  Adversarially Learned Inference , 2016, ICLR.

[18]  Anil A. Bharath,et al.  Inverting the Generator of a Generative Adversarial Network , 2016, IEEE Transactions on Neural Networks and Learning Systems.

[19]  David A. Wagner,et al.  Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples , 2018, ICML.

[20]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[21]  Paulina Grnarova,et al.  Defending Against Adversarial Attacks by Leveraging an Entire GAN , 2018, ArXiv.

[22]  Samy Bengio,et al.  Adversarial examples in the physical world , 2016, ICLR.

[23]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[24]  Yuichi Yoshida,et al.  Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.

[25]  Georg Langs,et al.  Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery , 2017, IPMI.

[26]  Ian J. Goodfellow,et al.  Technical Report on the CleverHans v2.1.0 Adversarial Examples Library , 2016 .

[27]  Trevor Darrell,et al.  Adversarial Feature Learning , 2016, ICLR.

[28]  David A. Wagner,et al.  Towards Evaluating the Robustness of Neural Networks , 2016, 2017 IEEE Symposium on Security and Privacy (SP).

[29]  Seyed-Mohsen Moosavi-Dezfooli,et al.  Universal Adversarial Perturbations , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Jeff Donahue,et al.  Large Scale GAN Training for High Fidelity Natural Image Synthesis , 2018, ICLR.

[31]  Rama Chellappa,et al.  Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models , 2018, ICLR.

[32]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.