Audio Steganalysis with Convolutional Neural Network

In recent years, deep learning has achieved breakthrough results in areas such as computer vision, audio recognition, and natural language processing. However, only a few studies have applied it to digital multimedia forensics and steganalysis. In this paper, we design a novel convolutional neural network (CNN) to detect audio steganography in the time domain. Unlike most existing CNN-based methods, which try to capture media content, we carefully design the network layers to suppress audio content and adaptively capture the minor modifications introduced by ±1 LSB-based steganography. In addition, we use a mix of convolutional layers and max pooling to perform subsampling, which achieves good abstraction and helps prevent over-fitting. In our experiments, we compare our network with six similar network architectures and two traditional methods that use handcrafted features. Extensive experimental results on 40,000 speech audio clips show the effectiveness of the proposed convolutional network.
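To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch of ±1 LSB embedding (often called LSB matching) and of a content-suppressing residual. It is an illustration under stated assumptions, not the network described in the paper: the function names `lsb_matching_embed` and `second_order_residual` are hypothetical, and the fixed kernel (-1, 2, -1) is an assumed example of a high-pass filter, since the abstract does not specify which filter the network uses.

```python
import numpy as np


def lsb_matching_embed(cover, bits, rng):
    """Embed `bits` into 16-bit samples with +/-1 LSB matching (illustrative)."""
    stego = cover.astype(np.int32).copy()
    n = len(bits)
    # A sample changes only when its least significant bit differs from the message bit.
    mismatch = (stego[:n] & 1) != bits
    # Randomly add or subtract 1; either choice flips the parity of the sample.
    steps = rng.choice(np.array([-1, 1]), size=n)
    # At the int16 limits only one direction is valid.
    steps = np.where(stego[:n] == 32767, -1, steps)
    steps = np.where(stego[:n] == -32768, 1, steps)
    stego[:n] += np.where(mismatch, steps, 0)
    return stego.astype(np.int16)


def second_order_residual(samples):
    """High-pass residual with the assumed kernel (-1, 2, -1).

    Smooth audio content largely cancels, while +/-1 changes remain visible.
    """
    x = samples.astype(np.int32)
    return np.convolve(x, np.array([-1, 2, -1]), mode="valid")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A smooth synthetic tone stands in for a speech clip (assumption for the demo).
    t = np.linspace(0.0, 20.0, 1000)
    cover = (1000 * np.sin(t)).astype(np.int16)
    bits = rng.integers(0, 2, size=cover.size)
    stego = lsb_matching_embed(cover, bits, rng)
    # Compare the residual energy of the cover and the stego signal.
    print(np.mean(second_order_residual(cover) ** 2))
    print(np.mean(second_order_residual(stego) ** 2))
```

The message is recovered by reading `stego & 1`, and no sample moves by more than one quantization step, which is why a detector has to work on a residual rather than on the audio content itself.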
