Spoofing Attack Detection using the Non-linear Fusion of Sub-band Classifiers

Spoofing poses a risk to the reliability of automatic speaker verification. Results from the biennial ASVspoof evaluations show that effective countermeasures demand front-ends designed specifically for the detection of spoofing artefacts. Given the diversity of spoofing attacks, ensemble methods are particularly effective. This paper shows that a bank of very simple classifiers, each with a front-end tuned to the detection of different spoofing attacks and combined at the score level through non-linear fusion, can deliver better performance than more sophisticated ensemble solutions that rely upon complex neural network architectures. Our comparatively simple approach outperforms all but 2 of the 48 systems submitted to the logical access condition of the most recent ASVspoof 2019 challenge.
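
The idea of score-level non-linear fusion can be illustrated with a minimal sketch. The abstract does not state which fusion function or sub-band classifiers are used, so everything here is an assumption made for illustration: the fusion is a logistic regression over a quadratic expansion of the scores, the two sub-band scores are synthetic, and the names `expand`, `fuse`, `train_fusion` and `make_toy_scores` are hypothetical.

```python
import math
import random


def expand(scores):
    """Map sub-band scores to bias, raw scores, squares and pairwise products."""
    feats = [1.0] + list(scores)
    for i in range(len(scores)):
        for j in range(i, len(scores)):
            feats.append(scores[i] * scores[j])
    return feats


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fuse(weights, scores):
    """Fused score: positive favours bona fide, negative favours spoof."""
    return sum(w * x for w, x in zip(weights, expand(scores)))


def train_fusion(score_vectors, labels, epochs=500, lr=0.1):
    """Fit the fusion weights by full-batch gradient descent on the logistic loss."""
    feats = [expand(s) for s in score_vectors]
    weights = [0.0] * len(feats[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(feats, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x))) - y
            for k, v in enumerate(x):
                grad[k] += err * v
        weights = [w - lr * g / len(feats) for w, g in zip(weights, grad)]
    return weights


def make_toy_scores(seed=0, n_per_cluster=50, sigma=0.3):
    """Synthetic stand-in for two sub-band classifier scores (assumed data).

    Bona fide trials (label 1) score high in both sub-bands; each of the two
    hypothetical attack types (label 0) is exposed by only one sub-band.
    """
    rng = random.Random(seed)
    scores, labels = [], []
    clusters = [((1.0, 1.0), 1)] * 2 + [((1.0, -1.0), 0), ((-1.0, 1.0), 0)]
    for (m1, m2), label in clusters:
        for _ in range(n_per_cluster):
            scores.append([rng.gauss(m1, sigma), rng.gauss(m2, sigma)])
            labels.append(label)
    return scores, labels


if __name__ == "__main__":
    scores, labels = make_toy_scores()
    weights = train_fusion(scores, labels)
    correct = sum((fuse(weights, s) > 0) == (y == 1) for s, y in zip(scores, labels))
    print(f"training accuracy: {correct / len(labels):.3f}")
```

The product term in `expand` is what makes the fusion non-linear in the sub-band scores: it lets the fused score depend on whether the sub-bands agree, which a weighted sum of the raw scores cannot express. In practice the inputs would be the scores of real sub-band classifiers rather than synthetic values.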
