Identification of Electronic Disguised Voices in the Noisy Environment

Voice disguise is increasingly used in illegal activities, which seriously hampers efforts to establish the authenticity of audio evidence in audio forensics, especially in noisy environments. It is therefore important to be able to identify whether a suspected voice has been disguised. However, few studies on such identification in noisy environments have been reported. In this paper, an algorithm based on statistical moments of linear frequency cepstral coefficients (LFCC) and statistical moments of formants is proposed to identify disguised voices under noisy conditions. First, LFCC statistical moments (mean values and unbiased variance estimates) and formant statistical moments (mean values) are extracted as acoustic features; then support vector machine (SVM) classifiers are used to separate disguised voices from original voices. Experimental results show that the proposed scheme performs well in noisy environments.
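
The feature construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `moment_features`, the input layout (one list of values per frame), and the toy numbers are assumptions made for illustration, and the LFCC and formant extraction steps themselves are taken as given.

```python
from statistics import mean, variance  # variance() uses the unbiased (n - 1) estimator


def moment_features(lfcc_frames, formant_frames):
    """Build one utterance-level feature vector from frame-level features.

    lfcc_frames:    list of frames, each a list of LFCC values (assumed layout)
    formant_frames: list of frames, each a list of formant frequencies in Hz
    """
    if len(lfcc_frames) < 2:
        raise ValueError("need at least two frames for an unbiased variance")

    # Transpose so that each tuple holds one coefficient across all frames.
    lfcc_cols = list(zip(*lfcc_frames))
    formant_cols = list(zip(*formant_frames))

    lfcc_means = [mean(col) for col in lfcc_cols]
    lfcc_vars = [variance(col) for col in lfcc_cols]
    formant_means = [mean(col) for col in formant_cols]

    # Concatenate: LFCC means, LFCC unbiased variances, formant means.
    return lfcc_means + lfcc_vars + formant_means


# Toy data (hypothetical): 3 frames, 2 LFCCs and 2 formants per frame.
lfcc = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
formants = [[500.0, 1500.0], [520.0, 1480.0], [510.0, 1490.0]]
features = moment_features(lfcc, formants)
print(features)
```

A vector of this kind, computed per utterance, could then be passed to an SVM classifier (for example scikit-learn's `sklearn.svm.SVC`) trained on labelled original and disguised voices; the kernel and its parameters are not stated in this abstract.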
