An End-to-end Architecture of Online Multi-channel Speech Separation
Zhuo Chen | Takuya Yoshioka | Lei Xie | Jinyu Li | Jian Wu | Zhili Tan | Yi Luo | Ed Lin