Improved, Deterministic Smoothing for L1 Certified Robustness

Randomized smoothing is a general technique for computing sample-dependent robustness guarantees against adversarial attacks for deep classifiers. Prior works on randomized smoothing against ℓ1 adversarial attacks use additive smoothing noise and provide probabilistic robustness guarantees. In this work, we propose a non-additive and deterministic smoothing method, Deterministic Smoothing with Splitting Noise (DSSN). To develop DSSN, we first introduce SSN, a randomized method in which each noisy smoothing sample is generated by randomly splitting the input space and then returning a representation of the center of the subdivision occupied by the input sample. In contrast to uniform additive smoothing, the SSN certification does not require the random noise components to be independent. Thus, smoothing can be done effectively in just one dimension and can therefore be efficiently derandomized for quantized data (e.g., images). To the best of our knowledge, this is the first work to provide deterministic "randomized smoothing" for a norm-based adversarial threat model while allowing an arbitrary classifier (i.e., a deep model) to be used as the base classifier and without requiring an exponential number of smoothing samples. On the CIFAR-10 and ImageNet datasets, we provide substantially larger ℓ1 robustness certificates compared to prior works, establishing a new state-of-the-art. The determinism of our method also leads to significantly faster certificate computation. Code is available at: https://github.com/alevine0/smoothingSplittingNoise.
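
To make the splitting-noise idea concrete, below is a minimal sketch of one SSN smoothing sample, assuming a per-coordinate split of the input range into intervals of width lam with a single shared random offset (the parameter names and the use of a shared offset for all coordinates are illustrative assumptions based on the abstract, not the authors' exact implementation):

```python
import numpy as np

def ssn_sample(x, lam, rng):
    """One SSN-style smoothing sample (illustrative sketch).

    Each coordinate's axis is split into intervals of width `lam` at a
    random offset; the noisy sample is the midpoint of the interval that
    contains the input coordinate.  A single shared offset is used for all
    coordinates, since the SSN certificate does not require the noise
    components to be independent.
    """
    offset = rng.uniform(0.0, lam)            # shared random splitting offset
    # index k of the interval [offset + k*lam, offset + (k+1)*lam) containing x
    k = np.floor((x - offset) / lam)
    return offset + (k + 0.5) * lam           # midpoint of that interval

# Usage: average a base classifier's predictions over many such samples, or,
# for quantized inputs, enumerate all possible offsets exactly to derandomize.
rng = np.random.default_rng(0)
x = np.array([0.30, 0.72, 0.05])              # toy input in [0, 1]^3
noisy = ssn_sample(x, lam=0.25, rng=rng)
```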
