Evolutionary Algorithm-based images, humanly indistinguishable and adversarial against Convolutional Neural Networks: efficiency and filter robustness

Convolutional neural networks (CNNs) have become one of the most important tools for image classification. However, many models are susceptible to adversarial attacks that lead them to misclassify images. In previous work, we developed an EA-based black-box attack that creates adversarial images for the \emph{target scenario}, which imposes two criteria: the CNN should classify the adversarial image in the target category with a confidence ≥ 0.95, and a human should not notice any difference between the adversarial and original images. Based on extensive experiments performed with the CNN $\mathcal{C}$ = VGG-16, trained on the CIFAR-10 dataset to classify images into 10 categories, this paper, which substantially enhances most aspects of Chitic \emph{et al.} (2021), addresses four issues. (1) From a \emph{pure} EA point of view, we highlight the conceptual originality of our algorithm $\text{EA}_{d}^{\text{target},\mathcal{C}}$ compared with the classical EA approach, and we assess the resulting competitive advantage experimentally on the image-classification task. (2) We then measure the intrinsic performance of the EA-based attack over an extensive series of ancestor images. (3) We challenge the filter resistance of the adversarial images created by the EA against five well-known filters. (4) Finally, we create natively filter-resistant adversarial images that can fool humans, CNNs, and CNNs composed with filters.
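To make the attack setting concrete, the following Python sketch shows a generic EA-based black-box targeted attack under the two criteria stated above. It is only an illustration, not the authors' $\text{EA}_{d}^{\text{target},\mathcal{C}}$ algorithm: it assumes a hypothetical query-only interface `predict_proba` that returns softmax probabilities for a batch of images, and it uses a simple truncation-selection EA with Gaussian mutation over a bounded perturbation of the ancestor image.

```python
# Minimal sketch of an EA-based black-box targeted attack (NOT the paper's EA):
# evolve a small perturbation of the ancestor image until the target-class
# confidence reaches 0.95 while keeping the perturbation visually negligible.
import numpy as np

def ea_targeted_attack(ancestor, target, predict_proba,
                       pop_size=40, generations=500,
                       eps=8.0 / 255.0, sigma=2.0 / 255.0,
                       conf_goal=0.95, rng=None):
    """ancestor: float image in [0, 1], shape (H, W, C); target: int class index."""
    rng = np.random.default_rng(rng)
    # Population of perturbations, clipped to an L_inf ball of radius eps so the
    # adversarial candidate stays visually indistinguishable from the ancestor.
    pop = np.clip(rng.normal(0.0, sigma, size=(pop_size,) + ancestor.shape), -eps, eps)

    def fitness(perturbations):
        imgs = np.clip(ancestor + perturbations, 0.0, 1.0)
        probs = predict_proba(imgs)                  # black-box queries only
        conf = probs[:, target]                      # confidence in the target class
        dist = np.abs(perturbations).mean(axis=(1, 2, 3))
        return conf - 0.1 * dist, conf               # reward confidence, penalize distortion

    for _ in range(generations):
        fit, conf = fitness(pop)
        if conf.max() >= conf_goal:                  # success: target confidence reached
            best = pop[conf.argmax()]
            return np.clip(ancestor + best, 0.0, 1.0)
        # Truncation selection: keep the better half of the population as parents.
        parents = pop[np.argsort(fit)[-pop_size // 2:]]
        # Offspring by Gaussian mutation of randomly chosen parents.
        idx = rng.integers(0, len(parents), size=pop_size - len(parents))
        children = parents[idx] + rng.normal(0.0, sigma, size=(len(idx),) + ancestor.shape)
        pop = np.clip(np.concatenate([parents, children]), -eps, eps)
    return None                                      # attack failed within the query budget
```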
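Items (3) and (4) concern filter robustness, i.e., whether an adversarial image still fools the CNN after common image filters are applied. The paper evaluates five well-known filters; the sketch below uses generic stand-ins (Gaussian, median, uniform smoothing from SciPy), which are assumptions rather than the filters actually studied, and reuses the hypothetical `predict_proba` interface from the previous sketch.

```python
# Hedged sketch of a filter-robustness check: apply a few common filters to an
# adversarial image and ask whether the CNN still assigns the target class with
# high confidence.
import numpy as np
from scipy import ndimage

def filter_robustness(adv_image, target, predict_proba, conf_goal=0.95):
    """adv_image: float image in [0, 1], shape (H, W, C)."""
    filters = {
        "gaussian": lambda x: ndimage.gaussian_filter(x, sigma=(1.0, 1.0, 0.0)),
        "median":   lambda x: ndimage.median_filter(x, size=(3, 3, 1)),
        "uniform":  lambda x: ndimage.uniform_filter(x, size=(3, 3, 1)),
    }
    report = {}
    for name, f in filters.items():
        filtered = np.clip(f(adv_image), 0.0, 1.0)
        probs = predict_proba(filtered[None, ...])[0]   # same black-box interface as above
        report[name] = {
            "predicted_class": int(probs.argmax()),
            "target_confidence": float(probs[target]),
            "still_adversarial": bool(probs[target] >= conf_goal),
        }
    return report
```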