Playback Speech Detection Application Based on Cepstrum Feature

With the popularity of portable recording devices, playback speech has become one of the most important means of attack on speaker authentication systems. A comparison with the original speech data shows that playback speech differs from it in the high-frequency layer, and that it also differs in the low-frequency layer because of the different recording equipment. Based on this finding, a detection algorithm is presented that extracts representative features from both layers. In the high-frequency layer, inverse-Mel filters (I-Mel) are used to extract speaker eigenvector sequences. In the low-frequency layer, linear filters (Linear) are combined with Mel filters (Mel) to avoid superposition of characteristic parameters. The layers are then fused into an L-M-I filter bank, from which new cepstral features are formed. The experimental results show that the method detects playback speech effectively, with an equal error rate of 2.63%. Compared with the traditional feature extraction methods (MFCC, CQCC, LFCC, IMFCC), the equal error rate decreases by 12.79%, 9.61%, 4.45% and 3.28%, respectively.
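The filter-bank idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the band boundary, the number of filters, or how the Linear and Mel filters are combined, so the 4 kHz split, the filter counts, the simple stacking of the three banks, and all function names below are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_edges(f_lo, f_hi, n_filters):
    # Corner frequencies equally spaced on the Mel scale (dense at the bottom).
    return mel_to_hz(np.linspace(hz_to_mel(f_lo), hz_to_mel(f_hi), n_filters + 2))

def linear_edges(f_lo, f_hi, n_filters):
    # Corner frequencies equally spaced in Hz.
    return np.linspace(f_lo, f_hi, n_filters + 2)

def inverse_mel_edges(f_lo, f_hi, n_filters):
    # Mirror a Mel spacing so that the filters are dense at the top of the band.
    return np.sort(f_hi - mel_edges(0.0, f_hi - f_lo, n_filters))

def triangular_bank(edges_hz, n_fft, sr):
    # One triangular filter per triple of consecutive corner frequencies.
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    bank = np.zeros((len(edges_hz) - 2, freqs.size))
    for i in range(len(edges_hz) - 2):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def lmi_filter_bank(sr=16000, n_fft=512, split_hz=4000.0,
                    n_lin=10, n_mel=10, n_imel=20):
    # ASSUMPTION: split frequency, filter counts and stacking order are
    # illustrative choices; the source does not specify them.
    nyquist = sr / 2.0
    lin = triangular_bank(linear_edges(0.0, split_hz, n_lin), n_fft, sr)
    mel = triangular_bank(mel_edges(0.0, split_hz, n_mel), n_fft, sr)
    imel = triangular_bank(inverse_mel_edges(split_hz, nyquist, n_imel), n_fft, sr)
    return np.vstack([lin, mel, imel])

def cepstra(power_spec, bank, n_ceps=13):
    # Log filter-bank energies followed by an unnormalized DCT-II.
    log_energy = np.log(power_spec @ bank.T + 1e-10)
    n = log_energy.shape[-1]
    k = np.arange(n_ceps)[:, None]
    basis = np.cos(np.pi * k * (np.arange(n) + 0.5) / n)
    return log_energy @ basis.T
```

Here `power_spec` is expected to be an array of per-frame power spectra with `n_fft // 2 + 1` bins per frame. Placing the inverse-Mel filters in the upper band gives finer resolution near the Nyquist frequency, which matches the abstract's observation that playback speech differs in the high-frequency layer.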
