Recurrent Convolutional Neural Networks for AMR Steganalysis Based on Pulse Position

With the rapid development of streaming multimedia, adaptive multi-rate (AMR) audio steganography has emerged in recent years. However, traditional steganalysis methods face great challenges in detecting short speech samples at low embedding rates. To address this problem, we propose SRCNet, a steganalytic scheme that combines a Recurrent Neural Network (RNN) and a Convolutional Neural Network (CNN). AMR fixed codebook (FCB) steganography embeds messages by modifying pulse positions, which destroys the FCB correlation. We first analyze the FCB correlations at different distances and summarize them into four categories. We then use the RNN to extract higher-level contextual representations of the FCBs and the CNN to fuse spatial-temporal features for steganalysis. The proposed approach was evaluated on a public dataset. The experimental results show that the proposed framework substantially outperforms existing state-of-the-art methods. The detection accuracy of SRCNet improves by at least 10% when the sample is as short as 100 ms at a 20% embedding rate. In particular, the network achieves significant improvements in detecting adaptive AMR steganography based on syndrome-trellis codes (STCs).
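To make the claim that modifying pulse positions destroys FCB correlation more concrete, the following is a minimal sketch and not the SRCNet pipeline. It assumes a toy cover signal in which each pulse repeats the previous subframe's position with some probability, and a hypothetical parity-forcing embedder. The helper names (`same_position_rate`, `make_toy_cover`, `toy_embed`, `demo`), the 10 pulses per subframe, and the position range 0..39 are illustrative assumptions, not values taken from the paper.

```python
import random


def same_position_rate(subframes, distance):
    """Fraction of pulse slots whose position repeats `distance` subframes later."""
    matches = 0
    total = 0
    for t in range(len(subframes) - distance):
        for a, b in zip(subframes[t], subframes[t + distance]):
            if a == b:
                matches += 1
            total += 1
    return matches / total if total else 0.0


def make_toy_cover(n_subframes, pulses, rng, repeat_prob=0.6, n_positions=40):
    """Toy cover (assumption): each pulse keeps its previous position with
    probability `repeat_prob`, otherwise it is drawn uniformly at random."""
    prev = tuple(rng.randrange(n_positions) for _ in range(pulses))
    out = [prev]
    for _ in range(n_subframes - 1):
        cur = tuple(
            p if rng.random() < repeat_prob else rng.randrange(n_positions)
            for p in prev
        )
        out.append(cur)
        prev = cur
    return out


def toy_embed(subframes, bits):
    """Hypothetical embedder: force the parity of each pulse position to one
    message bit. Pulses beyond the end of the message are left unchanged."""
    bit_iter = iter(bits)
    stego = []
    for sf in subframes:
        new_sf = []
        for p in sf:
            b = next(bit_iter, None)
            new_sf.append(p if b is None else (p & ~1) | b)
        stego.append(tuple(new_sf))
    return stego


def demo(seed=0):
    """Return (cover_rate, stego_rate) at distance 1 for a seeded toy signal."""
    rng = random.Random(seed)
    cover = make_toy_cover(n_subframes=2000, pulses=10, rng=rng)
    message = [rng.randrange(2) for _ in range(2000 * 10)]
    stego = toy_embed(cover, message)
    return same_position_rate(cover, 1), same_position_rate(stego, 1)


if __name__ == "__main__":
    cover_rate, stego_rate = demo()
    print(f"cover: {cover_rate:.3f}  stego: {stego_rate:.3f}")
```

In this toy setting the same-position rate between adjacent subframes drops noticeably after embedding, which is the kind of statistical change a steganalyzer can learn. Calling `same_position_rate` with larger `distance` values gives a rough analogue of measuring correlation at different distances.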
