Naoyuki Kanda | Takuya Yoshioka | Jinyu Li | Zhuo Chen | Yu Wu | Shujie Liu | Jian Wu | Sanyuan Chen
[1] Zhuo Chen, et al. Deep clustering: Discriminative embeddings for segmentation and separation, 2015, ICASSP 2016.
[2] Wanxiang Che, et al. Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting, 2020, EMNLP.
[3] Zhong-Qiu Wang, et al. Alternative Objective Functions for Deep Clustering, 2018, ICASSP 2018.
[4] Wei Li, et al. VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition, 2020, INTERSPEECH.
[5] John R. Hershey, et al. Hybrid CTC/Attention Architecture for End-to-End Speech Recognition, 2017, IEEE Journal of Selected Topics in Signal Processing.
[6] Ian McLoughlin, et al. Listening and Grouping: An Online Autoregressive Approach for Monaural Speech Separation, 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] DeLiang Wang, et al. Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation, 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[8] Nima Mesgarani, et al. Speaker-Independent Speech Separation With Deep Attractor Network, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[9] Hakan Erdogan, et al. Multi-Microphone Neural Speech Separation for Far-Field Multi-Talker Speech Recognition, 2018, ICASSP 2018.
[10] Ming Zhou, et al. Continuous Speech Separation with Conformer, 2020, arXiv.
[11] Jonathan Le Roux, et al. End-to-End Multi-Speaker Speech Recognition, 2018, ICASSP 2018.
[12] Chng Eng Siong, et al. SpEx+: A Complete Time Domain Speaker Extraction Network, 2020, INTERSPEECH.
[13] Takuya Yoshioka, et al. End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation, 2020, ICASSP 2020.
[14] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[15] DeLiang Wang, et al. Enhanced Spectral Features for Distortion-Independent Acoustic Modeling, 2019, INTERSPEECH.
[16] Naoya Takahashi, et al. Recursive speech separation for unknown number of speakers, 2019, INTERSPEECH.
[17] Naoyuki Kanda, et al. Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR, 2019, INTERSPEECH.
[18] Lei Xie, et al. DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement, 2020, INTERSPEECH.
[19] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[20] Yu Cheng, et al. Patient Knowledge Distillation for BERT Model Compression, 2019, EMNLP.
[21] Shinji Watanabe, et al. Building state-of-the-art distant speech recognition using the CHiME-4 challenge with a setup of speech enhancement baseline, 2018, INTERSPEECH.
[22] Dong Yu, et al. Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[23] Enhua Wu, et al. Squeeze-and-Excitation Networks, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[24] Ashish Vaswani, et al. Self-Attention with Relative Position Representations, 2018, NAACL.
[25] Zhong-Qiu Wang, et al. A Joint Training Framework for Robust Automatic Speech Recognition, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[26] Shinji Watanabe, et al. End-to-end Monaural Multi-speaker ASR System without Pretraining, 2018, ICASSP 2019.
[27] Takuya Yoshioka, et al. Advances in Online Audio-Visual Meeting Transcription, 2019, ASRU 2019.
[28] Nima Mesgarani, et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation, 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[29] Hermann Ney, et al. Investigation into Joint Optimization of Single Channel Speech Enhancement and Acoustic Modeling for Robust ASR, 2019, ICASSP 2019.
[30] Zhuo Chen, et al. An End-to-end Architecture of Online Multi-channel Speech Separation, 2020, INTERSPEECH.
[31] Chengyi Wang, et al. Semantic Mask for Transformer based End-to-End Speech Recognition, 2020, INTERSPEECH.
[32] Yu Zhang, et al. Conformer: Convolution-augmented Transformer for Speech Recognition, 2020, INTERSPEECH.
[33] Zhuo Chen, et al. Continuous Speech Separation: Dataset and Analysis, 2020, ICASSP 2020.
[34] Tomohiro Nakatani, et al. End-to-End Training of Time Domain Audio Separation and Recognition, 2020, ICASSP 2020.
[35] DeLiang Wang, et al. Combining Spectral and Spatial Features for Deep Learning Based Blind Speaker Separation, 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[36] Takuya Yoshioka, et al. Ultra Fast Speech Separation Model with Teacher Student Learning, 2021, INTERSPEECH.