Acoustic Environment Identification and Its Applications to Audio Forensics

An audio recording is subject to a number of possible distortions and artifacts, such as those caused by acoustic reverberation and background noise. Acoustic reverberation depends on the shape and composition of the room, and it causes temporal and spectral smearing of the recorded sound. Background noise, on the other hand, depends on the secondary audio sources active in the evidentiary recording. Extracting these acoustic cues from an audio recording is an important but challenging task. Temporal changes in the estimated reverberation and background noise can be used for dynamic acoustic environment identification (AEI), audio forensics, and ballistic settings. We describe a statistical technique to model and estimate the amount of reverberation and the background noise variance in an audio recording. An energy-based voice activity detection method is proposed for automatic decaying-tail selection from an audio recording. The effectiveness of the proposed method is tested on a data set of speech recordings, and its performance is evaluated for both speaker-dependent and speaker-independent scenarios.
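As a rough illustration of what energy-based decaying-tail selection could look like, the sketch below marks frames as active or inactive by short-time energy and then collects runs of monotonically falling energy that follow a speech offset. It is not the method of the paper: the function names (`frame_energy_db`, `find_decay_tails`), the frame length, the hop size, the -30 dB threshold, and the synthetic test signal are all assumptions chosen for the example.

```python
import numpy as np

def frame_energy_db(x, frame_len, hop):
    """Short-time energy of x in dB, one value per frame."""
    n_frames = 1 + (len(x) - frame_len) // hop
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = x[i * hop : i * hop + frame_len]
        energies[i] = np.mean(frame ** 2)
    # Small floor avoids log of zero on digital silence.
    return 10.0 * np.log10(energies + 1e-12)

def find_decay_tails(x, frame_len=256, hop=128, threshold_db=-30.0, min_frames=3):
    """Return (start_frame, end_frame) pairs of candidate decaying tails.

    A tail starts at the first inactive frame after an active one and
    extends while the frame energy keeps falling. All thresholds are
    illustrative assumptions, not values from the paper.
    """
    e_db = frame_energy_db(x, frame_len, hop)
    e_db = e_db - e_db.max()          # energy relative to the loudest frame
    active = e_db > threshold_db      # simple energy-based activity decision
    tails = []
    n = len(e_db)
    i = 1
    while i < n:
        if active[i - 1] and not active[i]:
            j = i
            # Extend the tail while energy is still strictly decreasing.
            while j + 1 < n and not active[j + 1] and e_db[j + 1] < e_db[j]:
                j += 1
            if j - i + 1 >= min_frames:
                tails.append((i, j))
            i = j + 1
        else:
            i += 1
    return tails

# Synthetic example: a steady 500 Hz tone followed by an exponential decay.
fs = 8000
n = np.arange(8000)
tone = np.sin(2 * np.pi * 500 * n / fs)
envelope = np.where(n < 4000, 1.0, np.exp(-(n - 4000) / 400.0))
signal = tone * envelope

tails = find_decay_tails(signal)
print(tails)
```

On real speech the frame energies are far less regular than in this synthetic signal, so the strict monotonic-decrease test would likely need smoothing or a tolerance before the selected tails could feed a reverberation estimate.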
