Robust Neural Networks Inspired by Strong Stability Preserving Runge-Kutta Methods

Deep neural networks have achieved state-of-the-art performance in a variety of fields. Recent work observes that a widely used class of neural networks can be viewed as the explicit Euler method for numerically discretizing an ordinary differential equation. From this numerical-discretization perspective, Strong Stability Preserving (SSP) methods are more advanced than the explicit Euler method and produce solutions that are both accurate and stable. Motivated by the SSP property and a generalized Runge-Kutta method, we propose Strong Stability Preserving networks (SSP networks), which improve robustness against adversarial attacks. We empirically demonstrate that the proposed networks are more robust to adversarial examples even without any defensive method, and that they are complementary to a state-of-the-art adversarial training scheme. Finally, our experiments show that SSP networks suppress the blow-up of adversarial perturbations. Our results open a way to study robust neural-network architectures by leveraging the rich literature on numerical discretization.
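
To make the Euler-vs-SSP contrast concrete, here is a minimal PyTorch-style sketch (not the authors' code): a plain residual block implements one explicit Euler step x + F(x), while an SSP block applies the classical three-stage SSP Runge-Kutta scheme of Shu and Osher as a convex combination of Euler steps. The module names and the choice to share the residual function F across stages are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EulerBlock(nn.Module):
    """Ordinary residual block: one explicit Euler step x_{n+1} = x_n + F(x_n)."""
    def __init__(self, f: nn.Module):
        super().__init__()
        self.f = f  # learned residual function F (e.g., a conv-BN-ReLU stack)

    def forward(self, x):
        return x + self.f(x)

class SSPRK3Block(nn.Module):
    """Block built from the three-stage SSP Runge-Kutta scheme (Shu-Osher):
    each stage is a convex combination of the input and a forward Euler step,
    which preserves the stability of the underlying Euler step."""
    def __init__(self, f: nn.Module):
        super().__init__()
        self.f = f  # residual function shared across stages (assumption)

    def forward(self, x):
        u1 = x + self.f(x)                                  # stage 1: Euler step
        u2 = 0.75 * x + 0.25 * (u1 + self.f(u1))            # stage 2
        return x / 3.0 + (2.0 / 3.0) * (u2 + self.f(u2))    # stage 3
```

Under this reading, an SSP network would stack SSPRK3Block modules where a ResNet stacks ordinary residual blocks; the convex-combination coefficients (3/4, 1/4, 1/3, 2/3) are fixed by the SSP scheme rather than learned.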
