ICSpk: Interpretable Complex Speaker Embedding Extractor from Raw Waveform
Jing Xiao | Jianzong Wang | Xiaoyang Qu | Lukáš Burget | Jan Černocký | Rongzhi Gu | Junyi Peng
[1] Thomas Fang Zheng, et al. CN-Celeb: multi-genre speaker recognition, 2020, Speech Commun.
[2] Marco Tagliasacchi, et al. LEAF: A Learnable Frontend for Audio Classification, 2021, ICLR.
[3] Man-Wai Mak, et al. Wav2Spk: A Simple DNN Architecture for Learning Speaker Embeddings from Waveforms, 2020, INTERSPEECH.
[4] Steve Renals, et al. A Deep 2D Convolutional Network for Waveform-Based Speech Recognition, 2020, INTERSPEECH.
[5] Yuexian Zou, et al. Deep Speaker Embedding with Long Short Term Centroid Learning for Text-Independent Speaker Verification, 2020, INTERSPEECH.
[6] Zhiyao Duan, et al. Raw-x-vector: Multi-scale Time Domain Speaker Embedding Network, 2020, ArXiv.
[7] Jee-weon Jung, et al. Improved RawNet with Filter-wise Rescaling for Text-independent Speaker Verification using Raw Waveforms, 2020, INTERSPEECH.
[8] Joon Son Chung, et al. In defence of metric learning for speaker recognition, 2020, INTERSPEECH.
[9] Dong Yu, et al. Multi-Modal Multi-Channel Target Speech Separation, 2020, IEEE Journal of Selected Topics in Signal Processing.
[10] Dong Wang, et al. CN-Celeb: A Challenging Chinese Speaker Recognition Dataset, 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Steve Renals, et al. On Learning Interpretable CNNs with Parametric Modulated Kernel-Based Filters, 2019, INTERSPEECH.
[12] Hye-jin Shim, et al. RawNet: Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification, 2019, INTERSPEECH.
[13] Jung-Woo Ha, et al. Phase-aware Speech Enhancement with Deep Complex U-Net, 2019, ICLR.
[14] Yoshua Bengio, et al. Speaker Recognition from Raw Waveform with SincNet, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[15] Joon Son Chung, et al. VoxCeleb2: Deep Speaker Recognition, 2018, INTERSPEECH.
[16] Ming Li, et al. Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System, 2018, Odyssey.
[17] Sébastien Marcel, et al. Towards Directly Modeling Raw Speech Signal for Speaker Verification Using CNNs, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Koichi Shinoda, et al. Attentive Statistics Pooling for Deep Speaker Embedding, 2018, INTERSPEECH.
[19] Quan Wang, et al. Generalized End-to-End Loss for Speaker Verification, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Sandeep Subramanian, et al. Deep Complex Networks, 2017, ICLR.
[21] Chunlei Zhang, et al. End-to-End Text-Independent Speaker Verification with Triplet Loss on Short Utterances, 2017, INTERSPEECH.
[22] Joon Son Chung, et al. VoxCeleb: A Large-Scale Speaker Identification Dataset, 2017, INTERSPEECH.
[23] Patrick Kenny, et al. Front-End Factor Analysis for Speaker Verification, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[24] Ronald W. Schafer, et al. Theory and Applications of Digital Speech Processing, 2010.
[25] Jianwu Dang, et al. An investigation of dependencies between frequency components and speaker characteristics for text-independent speaker identification, 2008, Speech Commun.
[26] Jr. J.P. Campbell, et al. Speaker recognition: a tutorial, 1997, Proc. IEEE.