Speaker Diarization with Enhancing Speech for the First DIHARD Challenge
暂无分享,去创建一个
Jun Du | Chin-Hui Lee | Lei Sun | Chao Jiang | Bing Yin | Shan He | Xueyang Zhang
[1] Jun Du,et al. Multiple-target deep learning for LSTM-RNN based speech enhancement , 2017, 2017 Hands-free Speech Communications and Microphone Arrays (HSCMA).
[2] P. Somervuo,et al. Bayesian Analysis of Speaker Diarization with Eigenvoice Priors , 2008 .
[3] James Philbin,et al. FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Yan Song,et al. Improved i-Vector Representation for Speaker Diarization , 2016, Circuits Syst. Signal Process..
[5] Radu Horaud,et al. Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[6] G. Schwarz. Estimating the Dimension of a Model , 1978 .
[7] Sue Tranter. Two-way cluster voting to improve speaker diarisation performance , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[8] Janet M. Baker,et al. The Design for the Wall Street Journal-based CSR Corpus , 1992, HLT.
[9] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Ephraim. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .
[11] Joon Son Chung,et al. VoxCeleb: A Large-Scale Speaker Identification Dataset , 2017, INTERSPEECH.
[12] Xiao Liu,et al. Deep Speaker: an End-to-End Neural Speaker Embedding System , 2017, ArXiv.
[13] Jean-François Bonastre,et al. The ELISA consortium approaches in broadcast news speaker segmentation during the NIST 2003 rich transcription evaluation , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[14] Xiao-Lei Zhang,et al. Deep Belief Networks Based Voice Activity Detection , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[15] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Jun Du,et al. A Novel LSTM-Based Speech Preprocessor for Speaker Diarization in Realistic Mismatch Conditions , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] DeLiang Wang,et al. Investigation of Speech Separation as a Front-End for Noise Robust Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[18] Nicholas W. D. Evans,et al. System output combination for improved speaker diarization , 2010, INTERSPEECH.
[19] Jun Du,et al. A universal VAD based on jointly trained deep neural networks , 2015, INTERSPEECH.
[20] Florin Curelaru,et al. Front-End Factor Analysis For Speaker Verification , 2018, 2018 International Conference on Communications (COMM).
[21] DeLiang Wang,et al. Ideal ratio mask estimation using deep neural networks for robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[22] James H. Elder,et al. Probabilistic Linear Discriminant Analysis for Inferences About Identity , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[23] Nicholas W. D. Evans,et al. Speaker Diarization: A Review of Recent Research , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[24] Jun Du,et al. SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement , 2016, INTERSPEECH.
[25] Jun Du,et al. An Experimental Study on Speech Enhancement Based on Deep Neural Networks , 2014, IEEE Signal Processing Letters.
[26] Mark Liberman,et al. Speech activity detection on youtube using deep neural networks , 2013, INTERSPEECH.
[27] Christian A. Müller,et al. Fusing short term and long term features for improved speaker diarization , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[28] Petr Motlícek,et al. Integrating online i-vector extractor with information bottleneck based speaker diarization system , 2015, INTERSPEECH.
[29] Shrikanth S. Narayanan,et al. A robust stopping criterion for agglomerative hierarchical clustering in a speaker diarization system , 2007, INTERSPEECH.
[30] Zhou Yu,et al. Enhancement and Analysis of Conversational Speech: JSALT 2017 , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Marijn Huijbregts,et al. The ICSI RT07s Speaker Diarization System , 2007, CLEAR.
[32] Fabio Valente,et al. An Information Theoretic Approach to Speaker Diarization of Meeting Data , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[33] Nicholas W. D. Evans,et al. The lia-eurecom RT'09 speaker diarization system: Enhancements in speaker modelling and cluster purification , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.