RIGA: Covert and Robust White-Box Watermarking of Deep Neural Networks

Watermarking of deep neural networks (DNNs) can enable their tracing once released by a data owner. In this paper, we generalize white-box watermarking algorithms for DNNs, in which the data owner needs white-box access to the model to extract the watermark. White-box watermarking algorithms have the advantage that they do not impact the accuracy of the watermarked model. We propose Robust whIte-box GAn watermarking (RIGA), a novel white-box watermarking algorithm that uses adversarial training. Our extensive experiments demonstrate that the proposed watermarking algorithm not only preserves model accuracy but also significantly improves covertness and robustness over the current state of the art.
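To make the white-box setting concrete, the sketch below illustrates the general embedding idea that RIGA builds on: a regularization term added to the training loss encourages a secret projection of a chosen layer's weights to encode the owner's bit string, which the owner can later recover given white-box access to the weights. The layer choice, projection matrix, bit length, and loss weight here are illustrative assumptions rather than the paper's exact construction, and RIGA's adversarially trained detector network (which keeps the marked weight distribution covert) is omitted for brevity.

```python
import torch
import torch.nn as nn

# Minimal sketch of white-box watermark embedding via a weight regularizer.
# The projection matrix, bit string, target layer, and loss weight are
# illustrative assumptions; RIGA additionally trains an adversarial detector
# so that the watermarked weights remain statistically inconspicuous.

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
target_layer = model[0]                                  # layer that carries the mark
watermark_bits = torch.randint(0, 2, (64,)).float()      # secret bit string b
proj = torch.randn(64, target_layer.weight.numel())      # secret projection matrix X

def watermark_loss(layer):
    # Embedding objective: sigmoid(X @ flatten(W)) should match b.
    logits = proj @ layer.weight.flatten()
    return nn.functional.binary_cross_entropy_with_logits(logits, watermark_bits)

def extract_bits(layer):
    # Owner-side extraction, requiring white-box access to the weights.
    return (torch.sigmoid(proj @ layer.weight.flatten()) > 0.5).float()

task_criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 0.1                                                # regularization strength (assumed)

def training_step(x, y):
    optimizer.zero_grad()
    loss = task_criterion(model(x), y) + lam * watermark_loss(target_layer)
    loss.backward()
    optimizer.step()
    return loss.item()

# Example step on random data, then check how many bits are already recoverable.
loss_val = training_step(torch.randn(32, 784), torch.randint(0, 10, (32,)))
match = (extract_bits(target_layer) == watermark_bits).float().mean()
print(f"loss={loss_val:.3f}, bit match={match:.2f}")
```

Because the watermark lives in a regularizer rather than in the training data or labels, the task loss is unchanged, which is why this family of schemes can avoid any accuracy penalty on the watermarked model.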
