Generative Adversarial Attacks Against Deep-Learning-Based Camera Model Identification

Recently, deep learning techniques have gained popularity in multimedia forensics research for tasks such as camera model identification. However, despite their success, research has shown that deep learning techniques are vulnerable to adversarial perturbations. These perturbations can cause deep learning classifiers to misclassify images even though they are imperceptible to human eyes. To understand the vulnerabilities of deep-learning-based forensic algorithms, we propose a novel anti-forensic framework inspired by generative adversarial networks that is capable of falsifying an image’s source camera model. To accomplish this, we design a generator to anti-forensically falsify camera model traces in an image without introducing visually perceptible changes or artifacts. We propose two techniques to adversarially train this generator depending on the knowledge available to the attacker. In a white-box scenario, in which complete knowledge of an investigator’s camera model identification network is available to the attacker, we directly incorporate that network into our generator’s adversarial training strategy. In a black-box scenario, in which no internal details of the camera model classifier are available to the attacker, we construct a substitute network to mimic its decisions, and then use this substitute network to adversarially train our generator. We conduct a series of experiments to evaluate the performance of our attack against several well-known CNN-based camera model classifiers. Experimental results show that our attack can successfully fool these CNNs in both white-box and black-box scenarios. Furthermore, our attack maintains high image quality and can be generalized to attack images from arbitrary source camera models.
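To make the white-box setting concrete, the following is a minimal, illustrative PyTorch-style sketch (not the paper's exact architecture or loss formulation): a generator is trained to modify an image so that a fixed, fully known camera model classifier predicts a chosen target camera model, while a fidelity term keeps the modification visually imperceptible. All module definitions, names, and loss weights here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Toy residual generator: predicts a small additive perturbation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        # A small residual keeps the falsified image close to the original.
        return torch.clamp(x + 0.05 * self.net(x), 0.0, 1.0)

def white_box_step(G, C, optimizer, images, target_model, lam=10.0):
    """One white-box training step: C is the investigator's camera model
    classifier, assumed fully known and kept frozen (hypothetical weights)."""
    G.train()
    C.eval()
    fake = G(images)
    logits = C(fake)                      # gradients flow through C into G
    # Push the known classifier toward the target camera model label...
    loss_cls = F.cross_entropy(logits, target_model)
    # ...while penalizing visible distortion of the image.
    loss_fid = F.mse_loss(fake, images)
    loss = loss_cls + lam * loss_fid
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the black-box setting described above, one would instead train a substitute classifier to mimic the investigator's decisions from its output labels and plug that substitute in place of `C` in a step like this one.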