Intra-class variation reduction of speaker representation in disentanglement framework
暂无分享,去创建一个
[1] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[2] Sanjeev Khudanpur,et al. Deep neural network-based speaker embeddings for end-to-end speaker verification , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[3] Kate Saenko,et al. Domain Agnostic Learning with Disentangled Representations , 2019, ICML.
[4] Xiaoqi Jia,et al. SEF-ALDR: A Speaker Embedding Framework via Adversarial Learning based Disentangled Representation , 2019 .
[5] Hoirin Kim,et al. Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs , 2020, INTERSPEECH.
[6] Aaron C. Courville,et al. MINE: Mutual Information Neural Estimation , 2018, ArXiv.
[7] Soumith Chintala,et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.
[8] Joon Son Chung,et al. VoxCeleb: A Large-Scale Speaker Identification Dataset , 2017, INTERSPEECH.
[9] Yifan Gong,et al. Adversarial Speaker Verification , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Joon Son Chung,et al. Utterance-level Aggregation for Speaker Recognition in the Wild , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Joon Son Chung,et al. VoxCeleb2: Deep Speaker Recognition , 2018, INTERSPEECH.
[12] Xiao Liu,et al. Deep Speaker: an End-to-End Neural Speaker Embedding System , 2017, ArXiv.
[13] Patrick Kenny,et al. Generative Adversarial Speaker Embedding Networks for Domain Robust End-to-end Speaker Verification , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Quan Wang,et al. Generalized End-to-End Loss for Speaker Verification , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Li-Rong Dai,et al. Non-Parallel Sequence-to-Sequence Voice Conversion With Disentangled Linguistic and Speaker Representations , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] Ming Li,et al. Analysis of Length Normalization in End-to-End Speaker Verification System , 2018, INTERSPEECH.
[17] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[18] Mathieu Serrurier,et al. Learning Disentangled Representations via Mutual Information Estimation , 2019, ECCV.
[19] Sanjeev Khudanpur,et al. Deep Neural Network Embeddings for Text-Independent Speaker Verification , 2017, INTERSPEECH.
[20] Georg Heigold,et al. End-to-end text-dependent speaker verification , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Shuai Wang,et al. Angular Softmax for Short-Duration Text-independent Speaker Verification , 2018, INTERSPEECH.
[22] Hoirin Kim,et al. Improving Multi-Scale Aggregation Using Feature Pyramid Module for Robust Speaker Verification of Variable-Duration Utterances , 2020, INTERSPEECH.
[23] Sanjeev Khudanpur,et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] S. Varadhan,et al. Asymptotic evaluation of certain Markov process expectations for large time , 1975 .
[25] Joost van de Weijer,et al. Image-to-image translation for cross-domain disentanglement , 2018, NeurIPS.
[26] Joon Son Chung,et al. In defence of metric learning for speaker recognition , 2020, INTERSPEECH.
[27] Yoshua Bengio,et al. Learning Speaker Representations with Mutual Information , 2018, INTERSPEECH.
[28] Bumsub Ham,et al. Learning Disentangled Representation for Robust Person Re-identification , 2019, NeurIPS.
[29] Yu Liu,et al. Exploring Disentangled Feature Representation Beyond Face Identification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[30] Dengxin Dai,et al. Unified Hypersphere Embedding for Speaker Recognition , 2018, ArXiv.
[31] Ming Li,et al. Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System , 2018, Odyssey.
[32] Erik McDermott,et al. Deep neural networks for small footprint text-dependent speaker verification , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Yoshua Bengio,et al. Learning deep representations by mutual information estimation and maximization , 2018, ICLR.
[34] Tao Jiang,et al. Training Multi-task Adversarial Network for Extracting Noise-robust Speaker Embedding , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Lin-Shan Lee,et al. Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations , 2018, INTERSPEECH.