Paper Information - Realistic Adversarial Examples in 3D Meshes

Realistic Adversarial Examples in 3D Meshes

Highly expressive models such as deep neural networks (DNNs) have been applied to a wide range of tasks with increasing success. However, recent studies show that such machine learning models are vulnerable to adversarial examples. So far, adversarial examples have been heavily explored for 2D images, while few works have examined the vulnerabilities of 3D objects, which exist in the real world and are projected to the 2D domain by photography for different learning (recognition) tasks. In this paper, we consider adversarial behaviors in practical scenarios by manipulating the shape and texture of a given 3D mesh representation of an object. Our goal is to project the optimized "adversarial meshes" to 2D with a photorealistic renderer such that the projections still mislead different machine learning models. Extensive experiments show that by generating an unnoticeable 3D adversarial perturbation on the shape or texture of a 3D mesh, the corresponding projected 2D instance can either lead classifiers to misclassify the victim object as an arbitrary malicious target, or hide any target object within the scene from object detectors. We conduct human studies to show that our optimized adversarial 3D perturbation is highly unnoticeable to the human visual system. In addition to the subtle perturbation of a given 3D mesh, we also propose to synthesize a realistic 3D mesh and place it in a scene that mimics similar rendering conditions, thereby attacking different machine learning models. An in-depth analysis of transferability among various 3D renderers and of the vulnerable regions of meshes is provided to help better understand adversarial behaviors in the real world.
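The core idea in the abstract (perturb the mesh, render it to 2D, and push the model toward a chosen target) can be illustrated with a deliberately tiny sketch. This is not the authors' implementation. It assumes a toy orthographic projection in place of a photorealistic differentiable renderer, a hand-set linear classifier in place of a DNN, and a plain L2 penalty in place of a perceptual constraint. Every name here (`render`, `attack`, `VERTICES`, `WEIGHTS`, `lam`) is hypothetical and chosen only for illustration.

```python
import math

def render(vertices):
    """Toy stand-in for a renderer: orthographic projection that drops z
    and flattens the (x, y) coordinates of each vertex into 'pixels'."""
    return [c for (x, y, _z) in vertices for c in (x, y)]

def logits(image, weights):
    """Linear classifier: one weight row per class."""
    return [sum(w * p for w, p in zip(row, image)) for row in weights]

def softmax(zs):
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def predict(vertices, weights):
    zs = logits(render(vertices), weights)
    return zs.index(max(zs))

def attack(vertices, weights, target, steps=300, lr=0.1, lam=0.05):
    """Gradient descent on a per-vertex displacement `delta`, minimizing
    cross-entropy toward `target` plus lam * ||delta||^2 (assumed penalty)."""
    delta = [[0.0, 0.0, 0.0] for _ in vertices]
    for _ in range(steps):
        perturbed = [[v + d for v, d in zip(vert, dl)]
                     for vert, dl in zip(vertices, delta)]
        probs = softmax(logits(render(perturbed), weights))
        # d(cross-entropy)/d(logit_k) = p_k - 1[k == target]
        g_logits = [p - (1.0 if k == target else 0.0)
                    for k, p in enumerate(probs)]
        # Back-propagate through the linear classifier to the pixels.
        g_image = [sum(g_logits[k] * weights[k][i] for k in range(len(weights)))
                   for i in range(len(weights[0]))]
        for v in range(len(vertices)):
            # Pixel 2v is x and pixel 2v+1 is y; z is invisible, so its
            # gradient through this projection is zero.
            grads = (g_image[2 * v], g_image[2 * v + 1], 0.0)
            for a in range(3):
                delta[v][a] -= lr * (grads[a] + 2.0 * lam * delta[v][a])
    return delta

# A single triangle; class 0 scores the x of vertex 1, class 1 scores its y.
VERTICES = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
WEIGHTS = [[0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]]

delta = attack(VERTICES, WEIGHTS, target=1)
adversarial = [[v + d for v, d in zip(vert, dl)]
               for vert, dl in zip(VERTICES, delta)]
print(predict(VERTICES, WEIGHTS), predict(adversarial, WEIGHTS))
```

In this toy setup the clean triangle is classified as class 0 and the perturbed one as the target class 1, while only the vertex the classifier depends on moves. The perturbation here is far from subtle. A realistic attack would replace `render` with a differentiable renderer and the L2 penalty with a constraint that keeps the change hard to notice.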
