Improving adversarial attacks on deep neural networks via constricted gradient-based perturbations
Abstract Despite the remarkable success of deep learning techniques, adversarial attacks on deep neural networks have exposed security issues in specific domains. Carefully crafted adversarial examples, generated by attack strategies under L_p norm bounds, readily mislead deep neural models on many professional tasks. Existing gradient-based adversarial attack methods fool state-of-the-art classification systems and achieve good adversarial effectiveness across a wide range of tasks. Nevertheless, we find that adversarial examples generated by gradient-based methods exhibit massive pixel modifications. Moreover, attack strategies based on stabilized gradients take the accumulation of the gradient into account, which introduces redundant perturbations with frequently occurring features into the generated adversarial examples, so the induced changes are visually perceptible and easily detected. To address this, we propose a family of adversarial attack approaches with a constricted gradient-based strategy, termed the Constricted Iterative Fast Gradient Sign Method (CI-FGSM). It lessens the impact of accumulated information on the crafted perturbations by deducting the amount of the preceding cumulative gradient. CI-FGSM requires only a few gradient-based operations on the generated inputs to reduce the accumulation of historical gradients, and thus crafts natural and imperceptible adversarial perturbations. We conduct experiments on MNIST, CIFAR10, and ImageNet ILSVRC2012 (Val) to evaluate the performance of the proposed approaches in misleading commonly used deep neural classification networks.
Experimental results show that, compared with other gradient-based adversarial attack methods, CI-FGSM effectively reduces the extra modifications to pristine inputs while maintaining the effectiveness of the attacks in fooling the classifiers under different norm constraints.
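The abstract does not state the CI-FGSM update rule, so the following is only an illustrative sketch and not the authors' algorithm. It shows a standard momentum-style iterative FGSM under an L-infinity bound on a toy logistic model, plus a hypothetical `constrict` parameter that subtracts a fraction of the previously accumulated gradient, as one possible reading of "deducting the amount of the preceding cumulative gradient". The model, the function names, and all parameter values are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_grad(x, y, w, b):
    """Gradient of binary cross-entropy w.r.t. the input x of a logistic model."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def iterative_fgsm(x, y, w, b, eps=0.1, steps=10, mu=1.0, constrict=0.0):
    """Momentum-style iterative FGSM under an L-infinity bound.

    `constrict` is a HYPOTHETICAL knob (not taken from the paper): it deducts
    a fraction of the previously accumulated gradient before the new gradient
    is added. With constrict=0 this reduces to plain momentum iterative FGSM.
    """
    alpha = eps / steps              # per-step size so the total stays within eps
    x_adv = list(x)
    g = [0.0] * len(x)               # accumulated gradient
    for _ in range(steps):
        grad = loss_grad(x_adv, y, w, b)
        l1 = sum(abs(v) for v in grad) or 1.0   # avoid division by zero
        # Accumulate the L1-normalised gradient, minus part of the history.
        g = [mu * gi - constrict * gi + v / l1 for gi, v in zip(g, grad)]
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        # Project back into the eps-ball around x and the valid range [0, 1].
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
        x_adv = [min(max(xa, 0.0), 1.0) for xa in x_adv]
    return x_adv
```

Calling `iterative_fgsm([0.2, 0.8, 0.5], 1, [1.0, -2.0, 0.5], 0.0, constrict=0.5)` perturbs each coordinate by at most `eps` in the direction that increases the loss for label 1. In this sketch, a larger `constrict` simply weakens the influence of earlier gradients on the current step; the real method may differ.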