Neural Diarization with Non-Autoregressive Intermediate Attractors
[1] M. Díez,et al. From Simulated Mixtures to Simulated Conversations as Training Data for End-to-End Neural Diarization, 2022, INTERSPEECH.
[2] H. Kim,et al. Auxiliary Loss of Transformer with Residual Connection for End-to-End Speaker Diarization , 2021, ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Leibny Paola García-Perera,et al. Encoder-Decoder Based Attractors for End-to-End Neural Diarization , 2021, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[4] Kyu J. Han,et al. A Review of Speaker Diarization: Recent Advances with Deep Learning , 2021, Comput. Speech Lang.
[5] Shinji Watanabe,et al. A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation , 2021, 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[6] A. Stolcke,et al. End-to-end Neural Diarization: From Transformer to Conformer , 2021, Interspeech.
[7] Scott Wisdom,et al. End-To-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Tatsuya Komatsu,et al. Relaxing the Conditional Independence Assumption of CTC-based ASR by Conditioning on Intermediate Predictions , 2021, Interspeech.
[9] Shinji Watanabe,et al. Intermediate Loss Regularization for CTC-Based Speech Recognition , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Kenneth Ward Church,et al. The Third DIHARD Diarization Challenge , 2020, Interspeech.
[11] Joon Son Chung,et al. Spot the conversation: speaker diarisation in the wild , 2020, INTERSPEECH.
[12] Shinji Watanabe,et al. End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors , 2020, INTERSPEECH.
[13] Naoyuki Kanda,et al. End-to-End Neural Speaker Diarization with Self-Attention , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[14] Naoyuki Kanda,et al. End-to-End Neural Speaker Diarization with Permutation-Free Objectives , 2019, INTERSPEECH.
[15] Kenneth Ward Church,et al. The Second DIHARD Diarization Challenge: Dataset, task, and baselines , 2019, INTERSPEECH.
[16] Shinji Watanabe,et al. Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge , 2018, INTERSPEECH.
[17] Sanjeev Khudanpur,et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Quan Wang,et al. Generalized End-to-End Loss for Speaker Verification , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Quan Wang,et al. Speaker Diarization with LSTM , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[21] Alan McCree,et al. Speaker diarization using deep neural network embeddings , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Jesper Jensen,et al. Permutation invariant training of deep models for speaker-independent multi-talker speech separation , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Zhuo Chen,et al. Deep clustering: Discriminative embeddings for segmentation and separation , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Daniel Povey,et al. MUSAN: A Music, Speech, and Noise Corpus , 2015, ArXiv.
[25] Daniel Garcia-Romero,et al. Speaker diarization with plda i-vector scoring and unsupervised calibration , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[26] James R. Glass,et al. Unsupervised Methods for Speaker Diarization: An Integrated and Iterative Approach , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[27] H. Bourlard,et al. Interpretation of Multiparty Meetings the AMI and Amida Projects , 2008, 2008 Hands-Free Speech Communication and Microphone Arrays.
[28] Xavier Anguera Miró,et al. Acoustic Beamforming for Speaker Diarization of Meetings , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[29] Douglas A. Reynolds,et al. An overview of automatic speaker diarization systems , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[30] Elizabeth Shriberg,et al. Overlap in Meetings: ASR Effects and Analysis by Dialog Factors, Speakers, and Collection Site , 2006, MLMI.
[31] Andreas Stolcke,et al. The ICSI Meeting Corpus , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).