Detecting and Counting Overlapping Speakers in Distant Speech Scenarios
Stefano Squartini | Maurizio Omologo | Emmanuel Vincent | Samuele Cornell
[1] Gerald Friedland,et al. Overlapped speech detection for improved speaker diarization in multiparty meetings , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[2] Valentin Andrei,et al. Detecting Overlapped Speech on Short Timeframes Using Deep Learning , 2017, INTERSPEECH.
[3] Janez Demsar,et al. Statistical Comparisons of Classifiers over Multiple Data Sets , 2006, J. Mach. Learn. Res.
[4] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[5] Emanuel A. P. Habets,et al. Classification vs. Regression in Supervised Learning for Single Channel Speaker Count Estimation , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Horia Cucu,et al. Overlapped Speech Detection and Competing Speaker Counting–Humans Versus Deep Learning , 2019, IEEE Journal of Selected Topics in Signal Processing.
[7] Bernd Edler,et al. CountNet: Estimating the Number of Concurrent Speakers Using Supervised Learning , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[8] Javier Ramírez,et al. Efficient voice activity detection algorithms using long-term speech information , 2004, Speech Commun.
[9] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Leibny Paola García-Perera,et al. Overlap-Aware Diarization: Resegmentation Using Neural End-to-End Overlapped Speech Detection , 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Mari Ostendorf,et al. Efficient use of overlap information in speaker diarization , 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[12] Gerald Friedland,et al. Two's a crowd: improving speaker diarization by automatically identifying and excluding overlapped speech , 2008, INTERSPEECH.
[13] Morgan Sonderegger,et al. Montreal Forced Aligner: Trainable Text-Speech Alignment Using Kaldi , 2017, INTERSPEECH.
[14] Nima Mesgarani,et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[15] Shinji Watanabe,et al. Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge , 2018, INTERSPEECH.
[16] Björn W. Schuller,et al. Detecting overlapping speech with long short-term memory recurrent neural networks , 2013, INTERSPEECH.
[17] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[18] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[19] Jon Barker,et al. CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings , 2020, 6th International Workshop on Speech Processing in Everyday Environments (CHiME 2020).
[20] Neville Ryant,et al. Leveraging LSTM Models for Overlap Detection in Multi-Party Meetings , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Florian Metze,et al. New Era for Robust Speech Recognition , 2017, Springer International Publishing.
[22] Vladlen Koltun,et al. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling , 2018, ArXiv.
[23] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011.
[24] Liyuan Liu,et al. On the Variance of the Adaptive Learning Rate and Beyond , 2019, ICLR.
[25] Yifan Gong,et al. Robust automatic speech recognition: a bridge to practical application , 2015.
[26] Zhou Yu,et al. Enhancement and Analysis of Conversational Speech: JSALT 2017 , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Xin Wang,et al. Speaker detection in the wild: Lessons learned from JSALT 2019 , 2019, Odyssey.
[28] Emmanuel Vincent,et al. Audio Source Separation and Speech Enhancement , 2018.
[29] Heiga Zen,et al. Speech Processing for Digital Home Assistants: Combining signal processing with deep-learning techniques , 2019, IEEE Signal Processing Magazine.
[30] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[31] Marie Kunesová,et al. Detection of Overlapping Speech for the Purposes of Speaker Diarization , 2019, SPECOM.
[32] Kenneth Ward Church,et al. The Second DIHARD Diarization Challenge: Dataset, task, and baselines , 2019, INTERSPEECH.
[33] Jean Carletta,et al. The AMI meeting corpus , 2005.
[34] Mireia Díez,et al. BUT System for DIHARD Speech Diarization Challenge 2018 , 2018, INTERSPEECH.
[35] Antonio Miguel,et al. gpuRIR: A python library for room impulse response simulation with GPU acceleration , 2018, Multimedia Tools and Applications.