A Comparison of Features for Replay Attack Detection

Automatic speaker verification (ASV) systems are still vulnerable to several kinds of spoofing attacks, especially replay attacks, which high-quality playback devices make easier to mount. Many countermeasures have been developed recently, and most of the effort has gone into the search for more salient features, with many new features proposed. This paper compares five kinds of features, namely Mel-frequency cepstral coefficients (MFCCs), linear frequency cepstral coefficients (LFCCs), inverted Mel-frequency cepstral coefficients (IMFCCs), constant Q cepstral coefficients (CQCCs) and bottleneck features, on the public ASVspoof 2017 and BTAS 2016 datasets. Our experimental results show that MFCCs and bottleneck features yield comparable results, and that both significantly outperform the others (including the recently proposed CQCCs). However, the success of MFCCs depends critically on the number of filters and the number of cepstral bins.
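To make the role of the filter count and the number of cepstral bins concrete, the following is a minimal sketch of how MFCC-, LFCC- and IMFCC-style features differ only in where the filterbank centre frequencies are placed, and how the cepstral bins are obtained by truncating a DCT. It is not the configuration used in the paper: the function names, the common 2595/700 Mel formula, and the construction of the inverted Mel scale by mirroring the Mel layout are all assumptions made for illustration.

```python
import math


def hz_to_mel(f):
    # Common Mel-scale formula (assumed here; other variants exist).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def center_frequencies(n_filters, f_max, scale="mel"):
    """Centre frequencies (Hz) of n_filters triangular filters on [0, f_max]."""
    if scale == "linear":
        # LFCC-style: filters spaced uniformly in Hz.
        step = f_max / (n_filters + 1)
        return [step * (i + 1) for i in range(n_filters)]
    # MFCC-style: filters spaced uniformly on the Mel scale.
    step = hz_to_mel(f_max) / (n_filters + 1)
    mel_centers = [mel_to_hz(step * (i + 1)) for i in range(n_filters)]
    if scale == "mel":
        return mel_centers
    if scale == "inverted_mel":
        # IMFCC-style (assumed construction): mirror the Mel layout so that
        # resolution is highest at high frequencies.
        return sorted(f_max - c for c in mel_centers)
    raise ValueError(f"unknown scale: {scale}")


def dct_cepstra(log_energies, n_ceps):
    """Unnormalised DCT-II of log filterbank energies, keeping n_ceps bins."""
    n = len(log_energies)
    return [
        sum(e * math.cos(math.pi * k * (j + 0.5) / n)
            for j, e in enumerate(log_energies))
        for k in range(n_ceps)
    ]
```

In this sketch, `n_filters` and `n_ceps` correspond to the two settings the abstract identifies as essential for MFCCs: the first controls the spectral resolution of the filterbank, and the second controls how much of the log-spectral envelope survives the DCT truncation.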
