CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks

In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks: small modifications of the input that change the model's predictions. Besides the rigorously studied $\ell_p$-bounded additive perturbations, recently proposed semantic perturbations (e.g., rotation, translation) raise serious concerns about deploying ML systems in the real world. It is therefore important to provide provable guarantees for deep learning models against semantically meaningful input transformations. In this paper, we propose a new universal probabilistic certification approach based on Chernoff-Cramér bounds that can be applied in general attack settings. We estimate the probability that a model fails when the attack is sampled from a given distribution. Our theoretical findings are supported by experimental results on different datasets.
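
To make the certification idea concrete, below is a minimal sketch (not the paper's exact procedure) of how a Chernoff-Cramér bound can turn Monte Carlo samples of a model's behavior under randomly drawn perturbations into an upper bound on its failure probability. The names `margin_drop`, `clean_margin`, `model`, and `x`, as well as the uniform rotation-angle distribution, are hypothetical placeholders used only for illustration.

```python
import numpy as np

def empirical_chernoff_bound(samples, threshold, lambdas=None):
    """Chernoff-Cramer style bound on P(X >= threshold) from i.i.d. samples of X.

    Uses P(X >= t) <= min_{lambda > 0} exp(-lambda * t) * E[exp(lambda * X)],
    with the moment-generating function replaced by its Monte Carlo estimate.
    """
    samples = np.asarray(samples, dtype=np.float64)
    if lambdas is None:
        lambdas = np.linspace(0.01, 10.0, 200)  # grid over which the bound is minimized
    bounds = [np.exp(-lam * threshold) * np.mean(np.exp(lam * samples)) for lam in lambdas]
    return float(min(min(bounds), 1.0))  # a probability bound never exceeds 1

# Hypothetical usage: sample semantic perturbations (e.g., rotation angles) from the
# attack distribution, measure how much each one degrades the prediction margin,
# and bound the probability that the degradation exceeds the clean margin.
# thetas = np.random.uniform(-30.0, 30.0, size=1000)          # attack distribution
# drops = [margin_drop(model, x, theta) for theta in thetas]   # placeholder helper
# print(empirical_chernoff_bound(drops, threshold=clean_margin(model, x)))
```

In a rigorous certificate, the empirical moment-generating-function estimate would itself require a high-confidence correction (e.g., via a concentration inequality), which this sketch omits.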
