SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization

We present a novel framework, Spatial Pyramid Attention Network (SPAN), for detection and localization of multiple types of image manipulations. The proposed architecture efficiently and effectively models the relationship between image patches at multiple scales by constructing a pyramid of local self-attention blocks. The design includes a novel position projection to encode the spatial positions of the patches. SPAN is trained on a generic, synthetic dataset but can also be fine-tuned for specific datasets. The proposed method shows significant gains in performance on standard datasets over previous state-of-the-art methods.
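The idea of a pyramid of local self-attention blocks can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 3x3 neighborhood, the dilation schedule (1, 3, 9), the residual connection, the zero padding, and the names `local_self_attention` and `attention_pyramid` are all assumptions made for illustration. The position projection is approximated here by giving each of the nine neighbor offsets its own key and value matrix, so the relative position of a neighbor is encoded by which projection is applied to it.

```python
import numpy as np


def local_self_attention(x, w_q, w_k, w_v, dilation=1):
    """One hypothetical local self-attention block.

    x:   feature map of shape (H, W, D)
    w_q: query projection, shape (D, D)
    w_k: per-offset key projections, shape (9, D, D)
    w_v: per-offset value projections, shape (9, D, D)
    Each position attends to its 3x3 neighborhood sampled at `dilation`.
    """
    h, w, d = x.shape
    pad = dilation
    # Zero-pad so that every position has a full (dilated) neighborhood.
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    q = x @ w_q  # (H, W, D)

    offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    keys, vals = [], []
    for i, (dy, dx) in enumerate(offsets):
        y0 = pad + dy * dilation
        x0 = pad + dx * dilation
        neighbor = xp[y0:y0 + h, x0:x0 + w]  # (H, W, D), shifted view
        # Offset-specific projections stand in for the position projection.
        keys.append(neighbor @ w_k[i])
        vals.append(neighbor @ w_v[i])
    keys = np.stack(keys, axis=2)  # (H, W, 9, D)
    vals = np.stack(vals, axis=2)  # (H, W, 9, D)

    # Scaled dot-product attention over the 9 neighbors of each position.
    scores = np.einsum("hwd,hwnd->hwn", q, keys) / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)

    out = np.einsum("hwn,hwnd->hwd", attn, vals)
    return x + out  # residual connection (an assumption of this sketch)


def attention_pyramid(x, params, dilations=(1, 3, 9)):
    """Stack blocks with growing dilation so later blocks relate distant patches."""
    for (w_q, w_k, w_v), dilation in zip(params, dilations):
        x = local_self_attention(x, w_q, w_k, w_v, dilation)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(16, 16, 8))
    params = [
        (rng.normal(size=(8, 8)), rng.normal(size=(9, 8, 8)), rng.normal(size=(9, 8, 8)))
        for _ in range(3)
    ]
    print(attention_pyramid(features, params).shape)
```

Stacking blocks with increasing dilation keeps the cost of each block linear in the number of positions, while the combined receptive field grows with every level. A real model would operate on learned convolutional features and end with a per-pixel mask prediction, which this sketch omits.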
