Spoofing Detection on the ASVspoof2015 Challenge Corpus Employing Deep Neural Networks
