Steganalysis into the Wild: How to Define a Source?

It is now well known that practical steganalysis using machine learning techniques can be strongly biased by the problem of Cover Source Mismatch. This phenomenon occurs in machine learning whenever the training and testing sets are drawn from different sources, i.e. when they do not share the same statistical properties. In steganalysis, however, its effect is especially severe: because the signal targeted by steganalysis methods has very low power, a mismatch can drastically lower detection performance. This paper aims to define, through practical experiments, what a source is in steganalysis. Assuming that two cover datasets coming from a common source should yield comparable steganalysis performance, it is shown that the definition of a source is more closely related to the processing pipeline of the RAW images than to the sensor or the acquisition setup of the pictures. To measure the discrepancy between sources, this paper introduces the concept of consistency between sources, which quantifies how much two sources are subject to Cover Source Mismatch. We show that by adopting "training design", we can increase the consistency between the training set and the testing set. To measure how much image processing operations may help the steganographer, this paper also introduces the intrinsic difficulty of a source. It is observed that some processing choices, such as the JPEG quantization tables or the development pipeline, can dramatically increase or decrease the performance of steganalysis methods, whereas other parameters, such as the ISO sensitivity or the sensor model, have only a minor impact on performance.
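The effect of training and testing on different sources can be illustrated with a toy experiment. The sketch below is not the paper's protocol, and it does not implement the paper's consistency or intrinsic difficulty measures. It assumes hypothetical one-dimensional Gaussian "features", a made-up offset between two sources, and a simple nearest-centroid detector, purely to show how a detector trained on one source can degrade on another.

```python
import random
from statistics import mean


def draw_source(rng, n, cover_mean, payload_shift=1.0):
    """Draw toy 1-D features for one hypothetical source.

    Covers are N(cover_mean, 1); stego features are shifted by
    payload_shift. Both values are illustrative assumptions.
    """
    covers = [rng.gauss(cover_mean, 1.0) for _ in range(n)]
    stegos = [rng.gauss(cover_mean + payload_shift, 1.0) for _ in range(n)]
    return covers, stegos


def train_threshold(covers, stegos):
    """Nearest-centroid detector: threshold halfway between class means."""
    return (mean(covers) + mean(stegos)) / 2.0


def error_rate(threshold, covers, stegos):
    """Average of the false-alarm rate and the missed-detection rate."""
    false_alarms = sum(x > threshold for x in covers) / len(covers)
    misses = sum(x <= threshold for x in stegos) / len(stegos)
    return (false_alarms + misses) / 2.0


def mismatch_experiment(seed=0, n=2000):
    """Train on source A, then test on source A and on a shifted source B."""
    rng = random.Random(seed)
    train_a = draw_source(rng, n, cover_mean=0.0)
    test_a = draw_source(rng, n, cover_mean=0.0)
    # Source B: same payload, but covers offset by a hypothetical
    # "pipeline" shift that is larger than the stego signal itself.
    test_b = draw_source(rng, n, cover_mean=2.0)

    threshold = train_threshold(*train_a)
    matched = error_rate(threshold, *test_a)
    mismatched = error_rate(threshold, *test_b)
    return matched, mismatched


if __name__ == "__main__":
    matched, mismatched = mismatch_experiment()
    print(f"error, train A / test A: {matched:.3f}")
    print(f"error, train A / test B: {mismatched:.3f}")
```

In this toy setting the offset between sources is larger than the embedding signal, so the detector trained on source A labels most source B covers as stego. Real steganalysis features are high-dimensional and the mismatch is far less clean-cut, so the numbers here carry no quantitative meaning.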
