Short Utterance Compensation in Speaker Verification via Cosine-Based Teacher-Student Learning of Speaker Embeddings
Hye-jin Shim | Ha-Jin Yu | Hee-Soo Heo | Jee-weon Jung
[1] Xiong Xiao, et al. Developing Far-Field Speaker System Via Teacher-Student Learning, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models, 2013.
[3] Xiao Liu, et al. Deep Speaker: an End-to-End Neural Speaker Embedding System, 2017, ArXiv.
[4] Jungwon Lee, et al. Bridgenets: Student-Teacher Transfer Learning Based on Recursive Neural Networks and Its Application to Distant Speech Recognition, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Yifan Gong, et al. Learning small-size DNN with output-distribution-based criteria, 2014, INTERSPEECH.
[6] Joon Son Chung, et al. VoxCeleb: A Large-Scale Speaker Identification Dataset, 2017, INTERSPEECH.
[7] Sébastien Marcel, et al. Towards Directly Modeling Raw Speech Signal for Speaker Verification Using CNNs, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Ha-Jin Yu, et al. Joint Training of Expanded End-to-End DNN for Text-Dependent Speaker Verification, 2017, INTERSPEECH.
[9] Hisashi Kawai, et al. Feature Representation of Short Utterances Based on Knowledge Distillation for Spoken Language Identification, 2018, INTERSPEECH.
[10] M. Ordin, et al. Acquisition of speech rhythm in a second language by learners with rhythmically different native languages, 2015, The Journal of the Acoustical Society of America.
[11] Hye-jin Shim, et al. RawNet: Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification, 2019, INTERSPEECH.
[12] Paavo Alku, et al. Accounting for uncertainty of i-vectors in speaker recognition using uncertainty propagation and modified imputation, 2015, INTERSPEECH.
[13] G. E. Peterson, et al. Duration of Syllable Nuclei in English, 1960.
[14] Sridha Sridharan, et al. i-vector Based Speaker Recognition on Short Utterances, 2011, INTERSPEECH.
[15] Hitoshi Yamamoto, et al. Denoising autoencoder-based speaker feature restoration for utterances of short duration, 2015, INTERSPEECH.
[16] Ha-Jin Yu, et al. Applying compensation techniques on i-vectors extracted from short-test utterances for speaker verification using deep neural network, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Hye-jin Shim, et al. Avoiding Speaker Overfitting in End-to-End DNNs Using Raw Waveform for Text-Independent Speaker Verification, 2018, INTERSPEECH.
[18] Patrick Kenny, et al. Front-End Factor Analysis for Speaker Verification, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Yoshua Bengio, et al. Interpretable Convolutional Filters with SincNet, 2018, ArXiv.
[20] Tara N. Sainath, et al. Compression of End-to-End Models, 2018, INTERSPEECH.
[21] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, ArXiv.
[22] Geoffrey E. Hinton, et al. Visualizing Data using t-SNE, 2008.
[23] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[24] Sanjeev Khudanpur, et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Hye-jin Shim, et al. A Complete End-to-End Speaker Verification System Using Deep Neural Networks: From Raw Signals to Verification Result, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Koichi Shinoda, et al. I-vector Transformation Using Conditional Generative Adversarial Networks for Short Utterance Speaker Verification, 2018, INTERSPEECH.