Investigating Deep Neural Networks for Speaker Diarization in the DIHARD Challenge
[1] Oliver Durr, et al. Speaker identification and clustering using convolutional neural networks, 2016, 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP).
[2] Georg Heigold, et al. End-to-end text-dependent speaker verification, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[4] Sridha Sridharan, et al. Improving out-domain PLDA speaker verification using unsupervised inter-dataset variability compensation approach, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Patrick Kenny, et al. Joint Factor Analysis of Speaker and Session Variability: Theory and Algorithms, 2006.
[6] Sridha Sridharan, et al. The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms, 2010, INTERSPEECH.
[7] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[8] Thilo Stadelmann, et al. Speaker identification and clustering using convolutional neural networks, 2016.
[9] Florin Curelaru, et al. Front-End Factor Analysis For Speaker Verification, 2018, 2018 International Conference on Communications (COMM).
[10] Douglas A. Reynolds, et al. Diarization of Telephone Conversations Using Factor Analysis, 2010, IEEE Journal of Selected Topics in Signal Processing.
[11] Marijn Huijbregts, et al. The ICSI RT07s Speaker Diarization System, 2007, CLEAR.
[12] Xiao Liu, et al. Deep Speaker: an End-to-End Neural Speaker Embedding System, 2017, ArXiv.
[13] Alan McCree, et al. Speaker diarization using deep neural network embeddings, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Sanjeev Khudanpur, et al. Deep Neural Network Embeddings for Text-Independent Speaker Verification, 2017, INTERSPEECH.
[15] Jean-Luc Gauvain, et al. Multistage speaker diarization of broadcast news, 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Sridha Sridharan, et al. Dataset-invariant covariance normalization for out-domain PLDA speaker verification, 2015, INTERSPEECH.
[17] Alan McCree, et al. Speaker diarization with i-vectors from DNN senone posteriors, 2015, INTERSPEECH.
[18] Nicholas W. D. Evans, et al. Speaker Diarization: A Review of Recent Research, 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Niko Brümmer, et al. Unsupervised Domain Adaptation for I-Vector Speaker Recognition, 2014, Odyssey.
[20] Yun Lei, et al. A novel scheme for speaker recognition using a phonetically-aware deep neural network, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Sridha Sridharan, et al. Domain Mismatch Modeling of Out-Domain i-Vectors for PLDA Speaker Verification, 2017, INTERSPEECH.
[22] Erik McDermott, et al. Deep neural networks for small footprint text-dependent speaker verification, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Itshak Lapidot, et al. On the Use of PLDA i-vector Scoring for Clustering Short Segments, 2016, Odyssey.
[24] M. A. Siegler, et al. Automatic Segmentation, Classification and Clustering of Broadcast News Audio, 1997.
[25] Oliver Durr, et al. Learning embeddings for speaker clustering based on voice equality, 2017, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP).
[26] Mickael Rouvier, et al. Speaker diarization through speaker embeddings, 2015, 2015 23rd European Signal Processing Conference (EUSIPCO).
[27] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[28] Alan McCree, et al. Priors for Speaker Counting and Diarization with AHC, 2016, INTERSPEECH.
[29] James Philbin, et al. FaceNet: A unified embedding for face recognition and clustering, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Tomasz Trzcinski, et al. Speaker Diarization Using Deep Recurrent Convolutional Neural Networks for Speaker Embeddings, 2017, ISAT.
[31] Daniel Garcia-Romero, et al. Speaker diarization with plda i-vector scoring and unsupervised calibration, 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[32] Sanjeev Khudanpur, et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).