Orthogonality Regularizations for End-to-End Speaker Verification
暂无分享,去创建一个
[1] Joon Son Chung,et al. VoxCeleb: A Large-Scale Speaker Identification Dataset , 2017, INTERSPEECH.
[2] Dacheng Tao,et al. Improving Training of Deep Neural Networks via Singular Value Bounding , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Hervé Bredin,et al. TristouNet: Triplet loss for speaker turn embedding , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Yun Lei,et al. A novel scheme for speaker recognition using a phonetically-aware deep neural network , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Nir Ailon,et al. Deep Metric Learning Using Triplet Network , 2014, SIMBAD.
[6] Basura Fernando,et al. Generalized BackPropagation, Étude De Cas: Orthogonality , 2016, ArXiv.
[7] Erik McDermott,et al. Deep neural networks for small footprint text-dependent speaker verification , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Yang Song,et al. Learning Fine-Grained Image Similarity with Deep Ranking , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[9] Yuichi Yoshida,et al. Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.
[10] Christopher Joseph Pal,et al. On orthogonality and learning recurrent networks with long term dependencies , 2017, ICML.
[11] Hye-jin Shim,et al. End-to-end losses based on speaker basis vectors and all-speaker hard negative mining for speaker verification , 2019, INTERSPEECH.
[12] Koichi Shinoda,et al. Attentive Statistics Pooling for Deep Speaker Embedding , 2018, INTERSPEECH.
[13] Shuai Wang,et al. BUT System Description to VoxCeleb Speaker Recognition Challenge 2019 , 2019, ArXiv.
[14] Quan Wang,et al. Attention-Based Models for Text-Dependent Speaker Verification , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Xiaohan Chen,et al. Can We Gain More from Orthogonality Regularizations in Training Deep CNNs? , 2018, NeurIPS.
[16] Daniel Povey,et al. Self-Attentive Speaker Embeddings for Text-Independent Speaker Verification , 2018, INTERSPEECH.
[17] Razvan Pascanu,et al. Understanding the exploding gradient problem , 2012, ArXiv.
[18] Dengxin Dai,et al. Unified Hypersphere Embedding for Speaker Recognition , 2018, ArXiv.
[19] Emmanuel J. Candès,et al. Decoding by linear programming , 2005, IEEE Transactions on Information Theory.
[20] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[21] Hao Tang,et al. Frame-Level Speaker Embeddings for Text-Independent Speaker Recognition and Analysis of End-to-End Model , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[22] Alan McCree,et al. Improving speaker recognition performance in the domain adaptation challenge using deep neural networks , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[23] Sergey Ioffe,et al. Probabilistic Linear Discriminant Analysis , 2006, ECCV.
[24] Sanjeev Khudanpur,et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Andrew W. Senior,et al. Long short-term memory recurrent neural network architectures for large scale acoustic modeling , 2014, INTERSPEECH.
[26] Yifan Gong,et al. End-to-End attention based text-dependent speaker verification , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[27] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[28] Razvan Pascanu,et al. Natural Neural Networks , 2015, NIPS.
[29] Joon Son Chung,et al. VoxCeleb2: Deep Speaker Recognition , 2018, INTERSPEECH.
[30] Georg Heigold,et al. End-to-end text-dependent speaker verification , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] F. Xavier Roca,et al. Regularizing CNNs with Locally Constrained Decorrelations , 2016, ICLR.
[32] Themos Stafylakis,et al. Deep Neural Networks for extracting Baum-Welch statistics for Speaker Recognition , 2014, Odyssey.
[33] Hye-jin Shim,et al. RawNet: Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification , 2019, INTERSPEECH.
[34] Xianglong Liu,et al. Orthogonal Weight Normalization: Solution to Optimization over Multiple Dependent Stiefel Manifolds in Deep Neural Networks , 2017, AAAI.
[35] Daniel Garcia-Romero,et al. Time delay deep neural network-based universal background models for speaker recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[36] Shiliang Pu,et al. All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Quan Wang,et al. Generalized End-to-End Loss for Speaker Verification , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[38] Chunlei Zhang,et al. End-to-End Text-Independent Speaker Verification with Triplet Loss on Short Utterances , 2017, INTERSPEECH.