Speaker diarization assisted ASR for multi-speaker conversations
[1] Jonathan Le Roux, et al. A Purely End-to-End System for Multi-speaker Speech Recognition, 2018, ACL.
[2] Naoyuki Kanda, et al. End-to-End Neural Speaker Diarization with Self-Attention, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[3] Jan Silovský, et al. PLDA-Based Clustering for Speaker Diarization of Broadcast Streams, 2011, INTERSPEECH.
[4] Jordi Luque, et al. On the use of agglomerative and spectral clustering in speaker diarization of meetings, 2012, Odyssey.
[5] Hagen Soltau, et al. Joint Speech Recognition and Speaker Diarization via Sequence Transduction, 2019, INTERSPEECH.
[6] Naoyuki Kanda, et al. Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of Any Number of Speakers, 2020, INTERSPEECH.
[7] Dong Yu, et al. Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[8] Tomohiro Nakatani, et al. SpeakerBeam: Speaker Aware Neural Network for Target Speaker Extraction in Speech Mixtures, 2019, IEEE Journal of Selected Topics in Signal Processing.
[9] Nima Mesgarani, et al. TaSNet: Time-Domain Audio Separation Network for Real-Time, Single-Channel Speech Separation, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[11] Steve Young, et al. HMMs and related speech recognition technologies, 2008.
[12] Jon Barker, et al. CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings, 2020, 6th International Workshop on Speech Processing in Everyday Environments (CHiME 2020).
[13] Dong Yu, et al. Recognizing Multi-talker Speech with Permutation Invariant Training, 2017, INTERSPEECH.
[14] Xiaofei Wang, et al. Investigation of End-to-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings, 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[15] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[16] Naoyuki Kanda, et al. End-to-End Neural Speaker Diarization with Permutation-Free Objectives, 2019, INTERSPEECH.
[17] James H. Elder, et al. Probabilistic Linear Discriminant Analysis for Inferences About Identity, 2007, 2007 IEEE 11th International Conference on Computer Vision.
[18] Sanjeev Khudanpur, et al. Speaker Recognition for Multi-speaker Conversations Using X-vectors, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).