Pixel-domain adversarial examples against CNN-based manipulation detectors

An attack method against convolutional neural network (CNN) detectors is proposed that minimises the distortion directly in the pixel domain. Focusing on CNN models developed for manipulation detection, experiments show that the small perturbations introduced by existing methods tend to be cancelled out when the adversarial examples are rounded to integer pixel values, which renders those attacks ineffective. By contrast, the proposed approach generates pixel-domain adversarial images that succeed in inducing a wrong decision with very small distortion.
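
The rounding effect described above can be illustrated with a minimal NumPy sketch. It is not the proposed attack and involves no detector: it only shows, under the assumption of 8-bit images normalised to [0, 1], that a perturbation smaller than half a quantisation step vanishes once the image is rounded back to pixels, whereas a perturbation of one full grey level survives. The helper name `to_pixels`, the image size, and the perturbation amplitudes are hypothetical choices made for illustration.

```python
import numpy as np

def to_pixels(x):
    """Map a float image in [0, 1] to 8-bit integer pixel values (hypothetical helper)."""
    return np.round(np.clip(x, 0.0, 1.0) * 255.0).astype(np.uint8)

rng = np.random.default_rng(0)

# Toy 8x8 "image" with grey levels in 1..254, so that a +/-1 step never clips.
image_px = rng.integers(1, 255, size=(8, 8)).astype(np.uint8)
image = image_px / 255.0  # normalised float representation fed to a network

# Perturbation below half a quantisation step (|delta| <= 0.4 / 255):
# assumed here to stand in for a very small float-domain adversarial perturbation.
small = rng.uniform(-0.4, 0.4, size=image.shape) / 255.0
changed_small = int(np.count_nonzero(to_pixels(image + small) != image_px))

# Perturbation of exactly one grey level per pixel: it lives in the pixel domain.
step = rng.choice([-1.0, 1.0], size=image.shape) / 255.0
changed_step = int(np.count_nonzero(to_pixels(image + step) != image_px))

print("pixels changed by sub-step perturbation:", changed_small)
print("pixels changed by one-level perturbation:", changed_step)
```

In this toy setting the sub-step perturbation leaves every pixel unchanged after rounding, which is the cancellation effect the abstract attributes to existing methods; an attack that is to remain effective on stored images therefore has to account for integer pixel values when measuring and minimising distortion.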