Toss that BOSSbase, Alice!

Steganographic schemes for digital images are routinely designed and benchmarked based on feedback obtained on the standard image set called BOSSbase 1.01. While standardized image sets are important for advancing the field, relying on results from a single source may not provide fair benchmarking and may even lead to designs that are overoptimized for that source and highly suboptimal on other image sources. In this paper, we investigate four modern steganographic schemes for the spatial domain, WOW, S-UNIWARD, HILL, and MiPOD, on BOSSbase 1.01 and two more versions of BOSSbase. We observed that with their default settings, the mutual ranking and detectability of all four embedding algorithms can change dramatically across the three image sources. For example, in a version of BOSSbase whose images were cropped instead of resized, all four schemes exhibit almost the same empirical security when steganalyzed with the spatial rich model (SRM). On the other hand, in decompressed JPEG images, WOW is the most secure embedding algorithm of the four, and this stays true irrespective of the JPEG quality factor when steganalyzing with both SRM and maxSRM. The empirical security of all four schemes can be increased by optimizing the parameters for each source. This is especially true for decompressed JPEGs. However, the ranking of stego schemes still varies depending on the source. Through this work, we strive to make the community aware of the fact that the empirical security of steganographic algorithms is not absolute but needs to be considered within a given environment, which includes the cover source.

Motivation

Currently, steganographic schemes are often developed and benchmarked on standard image sources. By far the most frequently used database is BOSSbase 1.01 [1], which contains 10,000 images taken in the RAW format by seven different cameras, converted to grayscale, downsampled using the Lanczos resampling algorithm with antialiasing turned OFF, and cropped to the final size of 512×512 pixels.
Many articles have been published in which this database was the sole source on which steganographers fine-tuned their embedding scheme to obtain the best possible empirical security. However, BOSSbase images are far from what many would consider natural: they are essentially grayscale thumbnails obtained by a script that only a handful of people use. Because of the rather aggressive downsizing of the original full-resolution RAW files, the content of many BOSSbase images is very complex with apparently rather weak dependencies among neighboring pixel values. The downsizing also effectively suppresses color interpolation artifacts and introduces artifacts of its own. There are images in BOSSbase that are very smooth, e.g., improperly focused images as well as images that are very dark and contain almost no content, such as an image of the Moon. One may thus argue that BOSSbase contains “enough” diversity to be used as a standardized source.

On the other hand, virtually all steganographic schemes contain free parameters or design elements, such as an image transform and filter kernels, that are selected based on feedback provided by detectors on BOSSbase. We show that this makes the design overoptimized to a given image source and the embedding suboptimal on different sources. Even after optimizing the parameters of each embedding scheme to the source, universal benchmarking still does not seem possible since the optimized schemes exhibit different empirical security across sources. Additionally, the recently proposed synchronization of embedding changes [4, 12] appears far less effective on images with suppressed noise.

In the next section, we explain the measure of empirical security used in this paper and how it is evaluated. We also describe three versions of BOSSbase that will be investigated, the steganographic algorithms and steganalysis feature sets, as well as the choice of the classifier.
In the third section, we start by comparing the empirical security of all algorithms on all three image sources with two different steganalysis feature sets. Then, in the fourth section, we identify the key parameters of each embedding scheme and perform a grid search to find the setting that maximizes the empirical security. The fifth section is devoted to investigating the impact of synchronizing the selection channel in different sources. The paper is concluded in the last section, where we summarize the most important lessons learned.

Setup of experiments

Security of embedding algorithms will be evaluated experimentally by training a binary classifier for the class of cover images and a class of stego images embedded with a fixed relative payload in bits per pixel (bpp), the so-called payload-limited sender. The classifier is the FLD ensemble [10] with two feature representations: the Spatial Rich Model (SRM) [7] and its selection-channel-aware version maxSRMd2 [5]. The security is reported with $P_{\mathrm{E}}$, which is the minimal total error probability under equal priors,

$$P_{\mathrm{E}} = \min_{P_{\mathrm{FA}}} \frac{1}{2}\left(P_{\mathrm{FA}} + P_{\mathrm{MD}}\right), \tag{1}$$

where $P_{\mathrm{FA}}$ and $P_{\mathrm{MD}}$ are the false-alarm and missed-detection probabilities. $P_{\mathrm{E}}$ is obtained on the testing set and averaged over ten 50/50 splits of the image source into training and testing sets.

Other measures were proposed in the past, such as the false-alarm rate for 50% correct detection of stego images [13], FA50, which is more telling about the algorithm security for low false alarms. It has been observed that, for the payload-limited sender and rich-model features, the detection statistic that is thresholded in the linearized version of the ensemble classifier [3] is approximately Gaussian. In this case, both quantities, $P_{\mathrm{E}}$ and FA50, would provide the same ranking of stego systems because there is a strictly monotone relationship between them.

For the purpose of this paper, we created the following two new versions of BOSSbase 1.01:

1. BOSSbaseC (C as in Cropped) was obtained using the same script as BOSSbase 1.01 but with the resizing step skipped.
The images were centrally cropped to 512×512 pixels right after they were converted from the RAW format to grayscale. Images from this source are less textured but do contain acquisition noise.

2. BOSSbaseJQF (J as in JPEG, QF is the JPEG quality factor) was formed from BOSSbase 1.01 images by JPEG compressing them with quality factor QF ∈ {75, 85, 95}, then decompressing them to the spatial domain and representing the resulting image as an 8-bit grayscale. The low-pass character of JPEG compression makes the images less textured and much less noisy.

Figure 1 shows examples of four images from each source. Notice that images from BOSSbaseC appear “zoomed-in” because of the absence of downsizing.

Four embedding algorithms will be investigated in this paper: Wavelet Obtained Weights (WOW) [8], the Spatial version of the UNIversal WAvelet Relative Distortion (S-UNIWARD) [9], High-Low-Low (HILL) [11], and Minimizing the power of the most POwerful Detector (MiPOD) [14], which coincides with the MultiVariate Gaussian (MVG) steganography with a Gaussian residual model [15]. The study is limited to the spatial domain and does not consider JPEG images because the source generally does not play a significant role in JPEG steganography due to the low-pass character of JPEG compression, which tends to even out the differences between various sources.

Empirical security across sources

The purpose of the first experiment is to show that the ranking of steganographic schemes as originally described in the corresponding papers heavily depends on the image source. Figure 2 shows $P_{\mathrm{E}}$ as a function of the relative payload in bits per pixel (bpp) for the four embedding algorithms listed in the previous section on BOSSbase 1.01 (first row), BOSSbaseC (second row), and BOSSbaseJ85 (third row) with SRM (left) and maxSRMd2 (right). Note that the ranking of the individual embedding algorithms, as well as the differences between them, depends heavily on the cover source.
Most notably, in BOSSbaseJ85, the most secure algorithm is WOW while MiPOD is the least secure, which is the exact opposite of the ranking on BOSSbase 1.01. Moreover, when detecting with the SRM, all four embedding schemes have nearly identical empirical security on BOSSbaseC.

Optimizing steganography for each source

In this section, we investigate how much the empirical security of each algorithm can be improved by adjusting the embedding parameters. This gain is quantified and the optimized embedding algorithms are ranked again for each image source. We start by describing the parameters with respect to which each embedding scheme was optimized. The description is kept short but, hopefully, detailed enough for a reader familiar with the embedding algorithms to understand the parameters’ role. The reader is referred to the corresponding publications for more details.

WOW: This embedding algorithm was designed to prefer making embedding changes at pixels in textured areas defined as regions with an “edge” in the horizontal, vertical, and both diagonal directions. The embedding begins with extracting directional residuals using tensor products of 8-tap Daubechies filters. Three directional filters with 8×8 kernels denoted $K^{(h)}$, $K^{(v)}$, and $K^{(d)}$ are used to extract three directional residuals: $R^{(h)} = K^{(h)} \star X$, $R^{(v)} = K^{(v)} \star X$, and $R^{(d)} = K^{(d)} \star X$, where '$\star$' denotes a convolution and $X$ is the matrix of pixel grayscales. In the next step, the so-called embedding suitabilities are computed: $\xi^{(k)} = |R^{(k)}| \star |K^{(k)}|$, $k \in \{h, v, d\}$. The embedding cost of changing pixel $i, j$ by +1 or −1 is obtained using the reciprocal Hölder norm

$$\rho_{ij} = \left( |\xi^{(h)}_{ij}|^{p} + |\xi^{(v)}_{ij}|^{p} + |\xi^{(d)}_{ij}|^{p} \right)^{-1/p}$$

with $p = -1$. To optimize WOW for different image sources, we search over the number of taps in the Daubechies filters, $p_1 \in \{2, 4, 8, 16\}$, and the power of the Hölder norm, $p_2 = p$.
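To make the role of the two WOW parameters concrete, the three steps above (directional residuals, embedding suitabilities, reciprocal Hölder norm) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the reference implementation: the function names, the mirror padding at image borders, the small floor `eps` that avoids division by zero, and the assignment of the labels h and v to the two separable kernels are all choices made here for illustration. Only the 2-tap and 4-tap Daubechies filters are hardcoded (closed-form values), and the outer exponent is taken as −1/p, the reading implied by the term "reciprocal Hölder norm".

```python
import numpy as np

def daubechies_lowpass(taps):
    """Closed-form Daubechies low-pass filter; this sketch covers 2 and 4 taps only."""
    s2, s3 = np.sqrt(2.0), np.sqrt(3.0)
    if taps == 2:
        return np.array([1.0, 1.0]) / s2
    if taps == 4:
        return np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * s2)
    raise ValueError("this sketch hardcodes only the 2- and 4-tap filters")

def directional_kernels(taps):
    """Tensor-product kernels K(h), K(v), K(d); the h/v labelling is a convention."""
    h = daubechies_lowpass(taps)
    # Quadrature-mirror high-pass filter: g[n] = (-1)^n * h[L-1-n]
    g = h[::-1] * (-1.0) ** np.arange(taps)
    return {"h": np.outer(h, g), "v": np.outer(g, h), "d": np.outer(g, g)}

def conv2_same(x, k):
    """2-D convolution with output the size of x (mirror padding is an assumption)."""
    kh, kw = k.shape
    top, left = kh // 2, kw // 2
    xp = np.pad(x, ((top, kh - 1 - top), (left, kw - 1 - left)), mode="symmetric")
    kf = k[::-1, ::-1]  # flip the kernel so this is a convolution, not a correlation
    out = np.zeros(x.shape, dtype=float)
    for a in range(kh):
        for b in range(kw):
            out += kf[a, b] * xp[a:a + x.shape[0], b:b + x.shape[1]]
    return out

def wow_costs(x, taps=4, p=-1.0, eps=1e-10):
    """Per-pixel costs from the three suitabilities via the reciprocal Hoelder norm."""
    x = np.asarray(x, dtype=float)
    total = np.zeros(x.shape, dtype=float)
    for k in directional_kernels(taps).values():
        r = conv2_same(x, k)                   # directional residual R(k)
        xi = conv2_same(np.abs(r), np.abs(k))  # embedding suitability xi(k)
        total += np.maximum(xi, eps) ** p      # eps floor is a hypothetical safeguard
    return total ** (-1.0 / p)
```

In this sketch, `taps` plays the role of p1 and `p` the role of p2. With p = −1 the cost reduces to the sum of the reciprocal suitabilities, so a pixel becomes expensive as soon as any one direction is smooth, which is why flat regions receive very high costs and textured regions low ones.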
S-UNIWARD: The pixel embedding costs are obtained from a distortion function defined as the sum of relative absolute differences between wavelet coefficients of cover and stego images. Only the highest frequency band of wavelet coefficients is use

[1] Jessica J. Fridrich, et al. Designing steganographic distortion using directional filters, 2012, 2012 IEEE International Workshop on Information Forensics and Security (WIFS).

[2] Jessica J. Fridrich, et al. Content-adaptive pentary steganography using the multivariate generalized Gaussian cover model, 2015, Electronic Imaging.

[3] Bin Li, et al. A Strategy of Clustering Modification Directions in Spatial Image Steganography, 2015, IEEE Transactions on Information Forensics and Security.

[4] Tomás Pevný, et al. "Break Our Steganographic System": The Ins and Outs of Organizing BOSS, 2011, Information Hiding.

[5] Jessica J. Fridrich, et al. Ensemble Classifiers for Steganalysis of Digital Media, 2012, IEEE Transactions on Information Forensics and Security.

[6] Jessica J. Fridrich, et al. Improving Steganographic Security by Synchronizing the Selection Channel, 2015, IH&MMSec.

[7] Jessica J. Fridrich, et al. Content-Adaptive Steganography by Minimizing Statistical Detectability, 2016, IEEE Transactions on Information Forensics and Security.

[8] Rainer Böhme, et al. Advanced Statistical Steganalysis, 2010, Information Security and Cryptography.

[9] Tomás Pevný, et al. Towards dependable steganalysis, 2015, Electronic Imaging.

[10] Jessica J. Fridrich, et al. Rich Models for Steganalysis of Digital Images, 2012, IEEE Transactions on Information Forensics and Security.

[11] Jessica J. Fridrich, et al. Selection-channel-aware rich model for Steganalysis of digital images, 2014, 2014 IEEE International Workshop on Information Forensics and Security (WIFS).

[12] Jessica J. Fridrich, et al. Minimizing Additive Distortion in Steganography Using Syndrome-Trellis Codes, 2011, IEEE Transactions on Information Forensics and Security.

[13] Bin Li, et al. A new cost function for spatial image steganography, 2014, 2014 IEEE International Conference on Image Processing (ICIP).

[14] Jessica Fridrich, et al. Modeling and Extending the Ensemble Classifier for Steganalysis of Digital Images Using Hypothesis Testing Theory, 2015, IEEE Transactions on Information Forensics and Security.