A new paradigm for steganalysis via clustering

We propose a new paradigm for blind, universal steganalysis in the case where multiple actors each transmit multiple objects, and guilty actors include some stego objects among their transmissions. The method is based on clustering rather than classification, and it is the actors that are clustered rather than their individual transmitted objects. This removes the need to train a classifier and, with it, the danger of training model mismatch. In effect, the method judges the behaviour of actors by assuming that most of them are innocent: after agglomerative hierarchical clustering, the guilty actor(s) end up in a cluster separate from the innocent majority. A case study shows that this works for JPEG images. Although the method is less sensitive than steganalysis based on specifically trained classifiers, it requires no training and no knowledge of the embedding algorithm, and it attacks the pooled steganalysis problem.
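The idea of clustering actors rather than objects can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the synthetic Gaussian "features", the use of Euclidean distance between per-actor mean feature vectors as the distance between actors, single linkage as the agglomeration rule, and the hypothetical helper names `sample_actor`, `centroid`, and `agglomerate`.

```python
import math
import random


def centroid(objects):
    """Mean feature vector over all objects transmitted by one actor."""
    return [sum(col) / len(objects) for col in zip(*objects)]


def agglomerate(actors, n_clusters=2):
    """Single-linkage agglomerative clustering of actors (assumed linkage)."""
    centres = {name: centroid(objs) for name, objs in actors.items()}
    clusters = [{name} for name in actors]  # start with one cluster per actor

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(centres[p], centres[q]) for p in c1 for q in c2)

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i] |= merged
    return clusters


rng = random.Random(0)


def sample_actor(n_objects, stego_fraction, shift=3.0, dim=4):
    """Synthetic stand-in for steganalysis features (purely illustrative)."""
    objects = []
    for k in range(n_objects):
        # Assumption: embedding shifts every feature of a stego object.
        offset = shift if k < stego_fraction * n_objects else 0.0
        objects.append([rng.gauss(offset, 1.0) for _ in range(dim)])
    return objects


# Six innocent actors and one guilty actor who embeds in half of their objects.
actors = {f"innocent_{i}": sample_actor(100, 0.0) for i in range(6)}
actors["guilty"] = sample_actor(100, 0.5)

clusters = agglomerate(actors, n_clusters=2)
suspects = min(clusters, key=len)  # the minority cluster is flagged
print(sorted(suspects))
```

No labelled training data is involved: the only prior is that the majority of actors are innocent, so the smaller cluster is the one flagged for attention.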
