A Novel Learnable Dictionary Encoding Layer for End-to-End Language Identification
Ming Li | Xiang Zhang | Weicheng Cai | Xiaoqi Wang | Zexin Cai
[1] Patrick Kenny, et al. Front-End Factor Analysis for Speaker Verification, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[2] Andrew Zisserman, et al. Deep Fisher Networks for Large-Scale Image Classification, 2013, NIPS.
[3] Daniel Garcia-Romero, et al. Analysis of i-vector Length Normalization in Speaker Recognition Systems, 2011, INTERSPEECH.
[4] Kristin J. Dana, et al. Deep TEN: Texture Encoding Network, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Yun Lei, et al. Advances in deep neural network approaches to speaker recognition, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Mikhail Kotov, et al. Language Identification Using Time Delay Neural Network D-Vector on Short Utterances, 2016, SPECOM.
[8] Douglas A. Reynolds, et al. Deep Neural Network Approaches to Speaker and Language Recognition, 2015, IEEE Signal Processing Letters.
[9] Ming Li, et al. Generalized I-vector Representation with Phonetic Tokenizations and Tandem Features for both Text Independent and Text Dependent Speaker Verification, 2015, Journal of Signal Processing Systems.
[10] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[11] Douglas A. Reynolds, et al. Speaker Verification Using Adapted Gaussian Mixture Models, 2000, Digit. Signal Process.
[12] Tomás Pajdla, et al. NetVLAD: CNN Architecture for Weakly Supervised Place Recognition, 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[13] Douglas A. Reynolds, et al. Robust text-independent speaker identification using Gaussian mixture speaker models, 1995, IEEE Trans. Speech Audio Process.
[14] Joaquín González-Rodríguez, et al. Automatic language identification using deep neural networks, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Yun Lei, et al. A novel scheme for speaker recognition using a phonetically-aware deep neural network, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Sanjeev Khudanpur, et al. Deep neural network-based speaker embeddings for end-to-end speaker verification, 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[17] Daniel Garcia-Romero, et al. Time delay deep neural network-based universal background models for speaker recognition, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[18] Sri Harish Reddy Mallidi, et al. Exploiting Hidden-Layer Responses of Deep Neural Networks for Language Recognition, 2016, INTERSPEECH.
[19] Aaron E. Rosenberg, et al. Report: A vector quantization approach to speaker recognition, 1987, AT&T Technical Journal.
[20] Ming Li, et al. Speaker verification and spoken language identification using a generalized i-vector framework with phonetic tokenizations and tandem features, 2014, INTERSPEECH.
[21] Joaquín González-Rodríguez, et al. Automatic language identification using long short-term memory recurrent neural networks, 2014, INTERSPEECH.
[22] Douglas E. Sturim, et al. Support vector machines using GMM supervectors for speaker verification, 2006, IEEE Signal Processing Letters.
[23] Haizhou Li, et al. An overview of text-independent speaker recognition: From features to supervectors, 2010, Speech Commun.
[24] Xiao Liu, et al. Deep Speaker: an End-to-End Neural Speaker Embedding System, 2017, ArXiv.