Thomas Hain | Yanpei Shi | Mingjie Chen | Qiang Huang
[1] Yong Xu, et al. Large-Scale Weakly Supervised Audio Classification Using Gated Convolutional Neural Network, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Koichi Shinoda, et al. Attentive Statistics Pooling for Deep Speaker Embedding, 2018, INTERSPEECH.
[3] Quan Wang, et al. Attention-Based Models for Text-Dependent Speaker Verification, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[5] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, ArXiv.
[6] Tanel Alumäe, et al. Weakly Supervised Training of Speaker Identification Models, 2018, Odyssey.
[7] Kevin P. Murphy, et al. Machine learning - a probabilistic perspective, 2012, Adaptive computation and machine learning series.
[8] Yoshifusa Ito, et al. Representation of functions by superpositions of a step or sigmoid function and their applications to neural network theory, 1991, Neural Networks.
[9] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[10] S Umesh, et al. S-vectors: Speaker Embeddings based on Transformer's Encoder for Text-Independent Speaker Verification, 2020, ArXiv.
[11] Daniel Povey, et al. Self-Attentive Speaker Embeddings for Text-Independent Speaker Verification, 2018, INTERSPEECH.
[12] V. Tiwari. MFCC and its applications in speaker recognition, 2010.
[13] Qiang Huang, et al. Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[14] Yuan Cao, et al. Leveraging Weakly Supervised Data to Improve End-to-end Speech-to-text Translation, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Hsiao-Chuan Wang, et al. A method of estimating the equal error rate for automatic speaker verification, 2004, 2004 International Symposium on Chinese Spoken Language Processing.
[16] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[17] Geoffrey E. Hinton, et al. Layer Normalization, 2016, ArXiv.
[18] Zhi-Hua Zhou, et al. A brief introduction to weakly supervised learning, 2018.
[19] Hitoshi Yamamoto, et al. Attention Mechanism in Speaker Recognition: What Does it Learn in Deep Speaker Embedding?, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[20] Thomas Hain, et al. Weakly Supervised Training of Hierarchical Attention Networks for Speaker Identification, 2020, INTERSPEECH.
[21] Joon Son Chung, et al. VoxCeleb: A Large-Scale Speaker Identification Dataset, 2017, INTERSPEECH.
[22] Sanjeev Khudanpur, et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Erik McDermott, et al. Deep neural networks for small footprint text-dependent speaker verification, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Thomas Hain, et al. H-Vectors: Utterance-Level Speaker Embedding Using a Hierarchical Attention Model, 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).