Defending Bit-Flip Attack through DNN Weight Reconstruction

Recent studies show that adversarial attacks on neural network weights, known as Bit-Flip Attacks (BFAs), can severely degrade a deep neural network's (DNN's) prediction accuracy. In this work, we propose a novel weight reconstruction method as a countermeasure against such BFAs. Specifically, during inference, the weights are reconstructed so that the weight perturbation caused by a BFA is either minimized or diffused to the neighboring weights. We demonstrate that our method significantly improves DNN robustness against both random and gradient-based BFA variants. Even under the most aggressive attack (i.e., greedy progressive bit search), our method maintains 60% test accuracy on ImageNet after 5 attack iterations, whereas the baseline accuracy drops below 1%.
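To make the diffusion idea concrete, below is a minimal illustrative sketch, not the paper's exact algorithm: each weight in a group is clamped to an assumed per-group bound and the clipped residual is carried into the next weight, so a perturbation concentrated in one stored weight (e.g., a flipped most-significant bit) is spread over its neighbors. The group size, the clipping bound clip_val, and the carry order are assumptions made purely for illustration.

```python
import numpy as np

def diffuse_reconstruct(weights, clip_val):
    """Clamp each weight in a group to [-clip_val, clip_val] and carry the
    clipped residual into the next weight, spreading a large perturbation
    on one stored weight across its neighbors instead of leaving it
    concentrated in a single weight (illustrative sketch only)."""
    out = np.empty_like(weights, dtype=np.float64)
    carry = 0.0
    for i, w in enumerate(weights):
        v = float(w) + carry
        clipped = float(np.clip(v, -clip_val, clip_val))
        carry = v - clipped  # residual diffused to the next weight
        out[i] = clipped
    return out

# Hypothetical example: one weight in the group has an abnormally large
# magnitude, as a flipped high-order bit might produce after dequantization.
group = np.array([0.05, -0.02, 1.27, 0.01], dtype=np.float64)
print(diffuse_reconstruct(group, clip_val=0.1))
```

In this sketch the per-weight perturbation after reconstruction is bounded by clip_val; the example prints roughly [0.05, -0.02, 0.1, 0.1], so the flipped weight's excess magnitude is absorbed by clipping and partially pushed onto its neighbor. Any residual left over at the end of the group is simply dropped in this simplified version.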
