Recording Source Identification Using Device Universal Background Model

The task of recording source identification is to extract digital evidence about the device that produced a speech recording by analyzing the acoustic signal itself. This paper proposes an algorithm for identifying recording devices based on a Device Universal Background Model (DEV-UBM). To obtain device features, the silent (non-speech) segments are extracted from the input speech and pre-processed, and the Mel-frequency cepstral coefficients (MFCCs) computed from these segments serve as the features of the recording device. The DEV-UBM is used to model each recording device, and log-likelihood scores are used to classify recordings by device. In experiments on 9 recording devices, the mean identification accuracy is 93.16%, which indicates that the proposed algorithm is effective.
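The scoring step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple energy threshold for picking silent frames, diagonal-covariance Gaussian mixtures as the device models, and toy two-dimensional vectors standing in for MFCCs. MFCC extraction and the training or adaptation of device models from a UBM are omitted, and all function names, model names, and parameter values are hypothetical.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def silent_frame_indices(samples, frame_len, threshold):
    """Indices of frames whose energy falls below an assumed threshold."""
    energies = frame_energies(samples, frame_len)
    return [i for i, e in enumerate(energies) if e < threshold]

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    gmm is a list of (weight, mean, var) components.
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)

def identify_device(frames, device_models):
    """Return the best-scoring device name and all log-likelihood scores."""
    scores = {
        name: gmm_avg_log_likelihood(frames, gmm)
        for name, gmm in device_models.items()
    }
    return max(scores, key=scores.get), scores

# Toy device models (hypothetical); real models would be trained on MFCCs.
device_models = {
    "device_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "device_b": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}
# Placeholder feature vectors standing in for MFCCs of silent frames.
features = [[0.1, -0.2], [0.9, 1.1]]
best_device, scores = identify_device(features, device_models)
```

Averaging the log-likelihood over frames keeps scores comparable across recordings of different lengths; the device whose model gives the highest score is taken as the identified source.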
