Accurate and Scalable Image Clustering Based on Sparse Representation of Camera Fingerprint

Clustering images according to their acquisition devices is a well-known problem in multimedia forensics, typically addressed using camera sensor pattern noise (SPN). The problem is challenging because SPN is a noise-like signal that is hard to estimate and easily attenuated or destroyed by many factors. Moreover, the high dimensionality of SPN hinders large-scale applications. Existing approaches are typically based on the correlation among SPNs in the pixel domain, which may fail to capture the intrinsic structure of data lying in a union of vector subspaces. In this paper, we propose an accurate clustering framework that exploits linear dependences among SPNs in their intrinsic vector subspaces. These dependences are encoded as sparse representations, which are obtained by solving a LASSO problem with a non-negativity constraint. The proposed framework is highly accurate both in estimating the number of clusters and in associating images with clusters. Moreover, it scales with the number of images and is robust against double JPEG compression as well as the presence of outliers, giving it strong potential for real-world applications. Experimental results on the Dresden and VISION databases show that the proposed framework adapts well to both medium-scale and large-scale contexts and outperforms state-of-the-art methods.
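
As a rough illustration of the sparse-representation step described above, the following minimal sketch expresses each SPN as a non-negative sparse combination of the other SPNs and symmetrises the coefficients into an affinity matrix. It is not the authors' implementation: the objective (0.5 times the squared residual plus `lam` times the sum of coefficients, with coefficients constrained to be non-negative), the coordinate-descent solver, the value of `lam`, the synthetic data, and all function names are assumptions made for illustration only.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    norm = dot(v, v) ** 0.5
    return [a / norm for a in v]

def nonneg_lasso(y, atoms, lam, n_iter=100):
    """Minimise 0.5*||y - sum_j c[j]*atoms[j]||^2 + lam*sum_j c[j], c[j] >= 0,
    by cyclic coordinate descent (an assumed solver, chosen for brevity)."""
    c = [0.0] * len(atoms)
    residual = list(y)                      # y minus the current reconstruction
    sq_norms = [dot(a, a) for a in atoms]
    for _ in range(n_iter):
        for j, atom in enumerate(atoms):
            # Correlation of atom j with the residual that excludes atom j.
            rho = dot(atom, residual) + c[j] * sq_norms[j]
            new_cj = max(0.0, (rho - lam) / sq_norms[j])
            delta = new_cj - c[j]
            if delta != 0.0:
                residual = [r - delta * a for r, a in zip(residual, atom)]
                c[j] = new_cj
    return c

def build_affinity(spns, lam=0.1):
    """Represent each SPN by the others, then symmetrise the coefficients."""
    n = len(spns)
    coeffs = []
    for i in range(n):
        others = [spns[j] for j in range(n) if j != i]
        c = nonneg_lasso(spns[i], others, lam)
        c.insert(i, 0.0)                    # an SPN never represents itself
        coeffs.append(c)
    return [[coeffs[i][j] + coeffs[j][i] for j in range(n)] for i in range(n)]

def make_synthetic_spns(n_cameras=2, per_camera=4, dim=256, seed=0):
    """Toy stand-in for real SPNs: a shared random fingerprint plus noise."""
    rng = random.Random(seed)
    spns, cams = [], []
    for cam in range(n_cameras):
        fingerprint = [rng.gauss(0, 1) for _ in range(dim)]
        for _ in range(per_camera):
            spns.append(unit([f + rng.gauss(0, 1) for f in fingerprint]))
            cams.append(cam)
    return spns, cams

spns, cams = make_synthetic_spns()
affinity = build_affinity(spns)
for row in affinity:
    print(" ".join(f"{w:.2f}" for w in row))
```

In this toy setting the printed matrix tends to be close to block-diagonal, with weight concentrated on pairs of SPNs from the same synthetic camera. In sparse subspace clustering such an affinity is commonly passed to a graph-based method such as spectral clustering; how the framework estimates the number of clusters and scales to large collections is not specified in the abstract and is not reproduced here.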
