Exposing speech tampering via spectral phase analysis

Abstract: Audio recordings serve as important evidence in law enforcement contexts. The most crucial problem in practical scenarios is determining whether a recording is authentic. For this task, blind audio tampering detection is typically based on electric network frequency (ENF) artifacts. Under high noise levels, however, ENF analysis becomes unreliable. In this paper, we present a novel approach to detect and locate tampering in uncompressed audio tracks by analyzing the spectral phase across the Short-Time Fourier Transform (STFT) sub-bands. Spectral phase reconstruction is employed to counteract the impact of noise. In addition, we propose a new feature, based on higher-order statistics of the spectral phase residual and on the spectral baseband phase correlation between two adjacent voiced segments, that allows for automated authentication. Experimental results show that the proposed method achieves a significant increase in detection accuracy over the conventional ENF-based method when the recording is exposed to a high level of noise. We also show that the proposed method remains robust under various noisy conditions.
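The idea of looking for phase discontinuities across STFT frames can be illustrated with a small sketch. This is not the authors' algorithm: it omits the phase reconstruction step and the voiced-segment correlation, and every name and parameter here (`stft`, `baseband_phase`, `phase_residual`, the 512-point Hann window, the hop of 128, the synthetic 500 Hz tone) is an assumption chosen for illustration. It removes the phase advance each bin is expected to accumulate per hop, takes the frame-to-frame difference of what remains, and summarizes it with excess kurtosis as one possible higher-order statistic.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Hann-windowed STFT; returns an array of shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def baseband_phase(spec, n_fft=512, hop=128):
    """Phase after removing each bin's expected advance at its centre frequency."""
    m = np.arange(spec.shape[0])[:, None]   # frame index
    k = np.arange(spec.shape[1])[None, :]   # bin index
    expected = 2.0 * np.pi * k * hop * m / n_fft
    return np.angle(spec * np.exp(-1j * expected))

def phase_residual(x, n_fft=512, hop=128):
    """Frame-to-frame baseband phase difference, wrapped to (-pi, pi]."""
    bb = baseband_phase(stft(x, n_fft, hop), n_fft, hop)
    return np.angle(np.exp(1j * np.diff(bb, axis=0)))

def excess_kurtosis(v):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    v = v - v.mean()
    return np.mean(v ** 4) / np.mean(v ** 2) ** 2 - 3.0

# Synthetic example (assumed values): a 500 Hz tone sampled at 8 kHz sits
# exactly on bin 32 of a 512-point transform.
fs, n_fft, hop, k0 = 8000, 512, 128, 32
n = np.arange(fs)
clean = np.cos(2.0 * np.pi * 500.0 * n / fs)

# Simulate an edit by deleting 5 samples at sample 4000.
tampered = np.concatenate([clean[:4000], clean[4005:]])

d = phase_residual(tampered, n_fft, hop)
peak_frame = int(np.argmax(np.abs(d[:, k0])))   # transition with the largest jump
kurt = excess_kurtosis(d[:, k0])
print("largest phase jump at frame transition", peak_frame)
print("excess kurtosis of the residual in the tone bin:", kurt)
```

In this toy case the residual in the tone bin is near zero away from the edit and spikes in the frames that straddle the deleted samples, which is what makes a peaked statistic such as kurtosis informative. Real speech would need per-harmonic tracking and noise handling that this sketch does not attempt.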
