Improving Multi-Scale Aggregation Using Feature Pyramid Module for Robust Speaker Verification of Variable-Duration Utterances
Hoirin Kim | Seong Min Kye | Youngmoon Jung | Yeunju Choi | Myunghun Jung
[1] Xiao Liu, et al. Deep Speaker: an End-to-End Neural Speaker Embedding System, 2017, ArXiv.
[2] Koichi Shinoda, et al. Attentive Statistics Pooling for Deep Speaker Embedding, 2018, INTERSPEECH.
[3] Wu-Jun Li, et al. Ensemble Additive Margin Softmax for Speaker Verification, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Hoirin Kim, et al. Spatial Pyramid Encoding with Convex Length Normalization for Text-Independent Speaker Verification, 2019, INTERSPEECH.
[5] Bowen Zhou, et al. Deep Speaker Embedding Learning with Multi-level Pooling for Text-independent Speaker Verification, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Ming Li, et al. Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System, 2018, Odyssey.
[7] Ian McLoughlin, et al. Improving Aggregation and Loss Function for Better Embedding Learning in End-to-End Speaker Verification System, 2019, INTERSPEECH.
[8] Bhiksha Raj, et al. SphereFace: Deep Hypersphere Embedding for Face Recognition, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] John H. L. Hansen, et al. Text-Independent Speaker Verification Based on Triplet Convolutional Neural Network Embeddings, 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[10] Daniel Garcia-Romero, et al. Analysis of i-vector Length Normalization in Speaker Recognition Systems, 2011, INTERSPEECH.
[11] Hoirin Kim, et al. Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention, 2020, INTERSPEECH.
[12] Thomas Brox, et al. U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015, MICCAI.
[13] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[14] Patrick Kenny, et al. Front-End Factor Analysis for Speaker Verification, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[15] Changmin Kim, et al. Shortcut Connections Based Deep Speaker Embeddings for End-to-End Speaker Verification System, 2019, INTERSPEECH.
[16] Hoirin Kim, et al. Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs, 2020, INTERSPEECH.
[17] Joon Son Chung, et al. VoxCeleb: A Large-Scale Speaker Identification Dataset, 2017, INTERSPEECH.
[18] Serge J. Belongie, et al. Residual Networks Behave Like Ensembles of Relatively Shallow Networks, 2016, NIPS.
[19] Patrick Kenny, et al. Bayesian Speaker Verification with Heavy-Tailed Priors, 2010, Odyssey.
[20] Sergey Ioffe, et al. Probabilistic Linear Discriminant Analysis, 2006, ECCV.
[21] Hoirin Kim, et al. Self-Adaptive Soft Voice Activity Detection Using Deep Neural Networks for Robust Speaker Verification, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[22] Ian McLoughlin, et al. An Effective Deep Embedding Learning Architecture for Speaker Verification, 2019, INTERSPEECH.
[23] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Joon Son Chung, et al. Utterance-level Aggregation for Speaker Recognition in the Wild, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Kaiming He, et al. Feature Pyramid Networks for Object Detection, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Sanjeev Khudanpur, et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Amirhossein Hajavi, et al. A Deep Neural Network for Short-Segment Speaker Recognition, 2019, INTERSPEECH.
[28] Joon Son Chung, et al. VoxCeleb2: Deep Speaker Recognition, 2018, INTERSPEECH.