EfficientTDNN: Efficient Architecture Search for Speaker Recognition
Shouling Ji | Zhihua Wei | Haoran Duan | Rui Wang | Yang Long | Zhen Hong
[1] Benjamin Barras,et al. SoX: Sound eXchange , 2012.
[2] Mingxing Tan,et al. EfficientNetV2: Smaller Models and Faster Training , 2021, ICML.
[3] Zhangyang Wang,et al. AutoSpeech: Neural Architecture Search for Speaker Recognition , 2020, INTERSPEECH.
[4] Shuai Wang,et al. Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition , 2019, 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[5] Harsha Vardhan,et al. The Leap Speaker Recognition System for NIST SRE 2018 Challenge , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Yiming Yang,et al. DARTS: Differentiable Architecture Search , 2018, ICLR.
[7] Douglas A. Reynolds,et al. The 2018 NIST Speaker Recognition Evaluation , 2019, INTERSPEECH.
[8] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Chuang Gan,et al. Once for All: Train One Network and Specialize it for Efficient Deployment , 2019, ICLR.
[10] Song Han,et al. ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware , 2018, ICLR.
[11] Yun Lei,et al. Advances in deep neural network approaches to speaker recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Jing Xiao,et al. Evolutionary Algorithm Enhanced Neural Architecture Search for Text-Independent Speaker Verification , 2020, INTERSPEECH.
[13] Shuai Wang,et al. Joint I-Vector with End-to-End System for Short Duration Text-Independent Speaker Verification , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Kai Zhao,et al. Res2Net: A New Multi-Scale Backbone Architecture , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[15] Witold Pedrycz,et al. Linguistic models and linguistic modeling , 1999, IEEE Trans. Syst. Man Cybern. Part B.
[16] Sanjeev Khudanpur,et al. A study on data augmentation of reverberant speech for robust speech recognition , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Hung-yi Lee,et al. DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation , 2020, INTERSPEECH.
[18] Samin Ishtiaq,et al. NAS-Bench-ASR: Reproducible Neural Architecture Search for Speech Recognition , 2021, ICLR.
[19] Daniel Povey,et al. MUSAN: A Music, Speech, and Noise Corpus , 2015, ArXiv.
[20] Quoc V. Le,et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.
[21] Dengxin Dai,et al. Unified Hypersphere Embedding for Speaker Recognition , 2018, ArXiv.
[22] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Vivienne Sze,et al. Efficient Processing of Deep Neural Networks: A Tutorial and Survey , 2017, Proceedings of the IEEE.
[24] Longhui Wei,et al. Weight-Sharing Neural Architecture Search: A Battle to Shrink the Optimization Gap , 2020, ACM Comput. Surv.
[25] Joon Son Chung,et al. VoxSRC 2019: The first VoxCeleb Speaker Recognition Challenge , 2019, ArXiv.
[26] Xiangyu Zhang,et al. Single Path One-Shot Neural Architecture Search with Uniform Sampling , 2019, ECCV.
[27] Pooyan Safari,et al. Self-attention encoding and pooling for speaker recognition , 2020, INTERSPEECH.
[28] Mathieu Salzmann,et al. How to Train Your Super-Net: An Analysis of Training Heuristics in Weight-Sharing NAS , 2020, ArXiv.
[29] Joon Son Chung,et al. VoxCeleb2: Deep Speaker Recognition , 2018, INTERSPEECH.
[30] Wu-Jun Li,et al. Densely Connected Time Delay Neural Network for Speaker Verification , 2020, INTERSPEECH.
[31] Joon Son Chung,et al. Utterance-level Aggregation for Speaker Recognition in the Wild , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Zhijian Ou,et al. Efficient Neural Architecture Search for End-to-End Speech Recognition Via Straight-Through Gradients , 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[33] Quoc V. Le,et al. Understanding and Simplifying One-Shot Architecture Search , 2018, ICML.
[34] Joon Son Chung,et al. VoxCeleb: A Large-Scale Speaker Identification Dataset , 2017, INTERSPEECH.
[35] Longbiao Wang,et al. ARET: Aggregated Residual Extended Time-Delay Neural Networks for Speaker Verification , 2020, INTERSPEECH.
[36] Ji Liu,et al. SpeechNAS: Towards Better Trade-off between Latency and Accuracy for Large-Scale Speaker Verification , 2021, ArXiv.
[37] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[38] Joon Son Chung,et al. Clova Baseline System for the VoxCeleb Speaker Recognition Challenge 2020 , 2020, ArXiv.
[39] Kris Demuynck,et al. ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification , 2020, INTERSPEECH.
[40] Joon Son Chung,et al. Voxceleb: Large-scale speaker verification in the wild , 2020, Comput. Speech Lang.
[41] Witold Pedrycz,et al. The design of cognitive maps: A study in synergy of granular computing and evolutionary optimization , 2010, Expert Syst. Appl.
[42] Lawrence Carin,et al. Learning Autoencoders with Relational Regularization , 2020, ICML.
[43] Enhong Chen,et al. Lightspeech: Lightweight and Fast Text to Speech with Neural Architecture Search , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[44] Enhua Wu,et al. Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[45] Joon Son Chung,et al. In defence of metric learning for speaker recognition , 2020, INTERSPEECH.
[46] Tan Lee,et al. Text-Independent Speaker Verification with Dual Attention Network , 2020, INTERSPEECH.
[47] Stefanos Zafeiriou,et al. ArcFace: Additive Angular Margin Loss for Deep Face Recognition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Alan McCree,et al. MagNetO: X-vector Magnitude Estimation Network plus Offset for Improved Speaker Recognition , 2020, Odyssey.
[49] Sanjeev Khudanpur,et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[50] Aaron Lawson,et al. The Speakers in the Wild (SITW) Speaker Recognition Database , 2016, INTERSPEECH.