Paper information - Speaker-independent source cell-phone identification for re-compressed and noisy audio recordings

With the rapid increase in user-generated multimedia content, its extensive outreach over social media, and its potential in critical applications such as law enforcement, source identification from re-compressed and noisy multimedia is of great importance. This paper proposes a system for speaker-independent cell-phone identification from recorded audio. The system can handle test audio whose speech content and speaker both differ from those of the training audio. Each recording carries the device fingerprint implicitly embedded in it, which motivates a CNN-based system that learns device-specific signatures directly from the magnitude of the discrete Fourier transform of the audio. The paper also addresses the scenario where the recorded audio is re-compressed for efficient storage and network transmission, which is common in this age of social media, as well as cell-phone classification from audio recordings in the presence of additive white Gaussian noise. We show that the proposed system performs as well as state-of-the-art systems in the speaker-dependent case with clean audio recordings and exhibits much higher robustness in the speaker-independent case with clean, re-compressed, and noisy audio recordings.
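The two signal-level ideas in the abstract, corrupting a recording with additive white Gaussian noise and taking the magnitude of its discrete Fourier transform as network input, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the frame length of 64, the hop of 32, the 20 dB SNR, and the synthetic sinusoid standing in for a recording are all assumptions chosen for illustration, and the naive DFT is used only to keep the example free of dependencies.

```python
import cmath
import math
import random


def add_awgn(signal, snr_db, rng):
    """Return the signal plus white Gaussian noise scaled to the requested SNR in dB."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def dft_magnitude(frame):
    """Magnitude of the one-sided DFT of a real frame (naive O(N^2) version)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def magnitude_frames(signal, frame_len=64, hop=32):
    """Split the signal into overlapping frames; return one magnitude spectrum per frame."""
    return [
        dft_magnitude(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Hypothetical stand-in for a recording: a sinusoid with 8 cycles per 64-sample frame.
clean = [math.sin(2 * math.pi * 8 * t / 64) for t in range(256)]
noisy = add_awgn(clean, snr_db=20, rng=rng)

# Each entry is a magnitude spectrum that a classifier could take as input.
clean_spec = magnitude_frames(clean)
noisy_spec = magnitude_frames(noisy)
```

In practice the naive DFT would be replaced by an FFT routine, and the per-frame magnitude spectra would be stacked into a two-dimensional array before being passed to a CNN; how the paper frames, normalizes, or re-compresses the audio is not stated in the abstract and is not modeled here.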

Vinay Verma | Nitin Khanna
