Neural Speech Turn Segmentation and Affinity Propagation for Speaker Diarization
暂无分享,去创建一个
[1] Hervé Bredin,et al. TristouNet: Triplet loss for speaker turn embedding , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Jean-Luc Gauvain,et al. Spoken Language Identification Using LSTM-Based Angular Proximity , 2017, INTERSPEECH.
[3] Sylvain Meignier,et al. S4D: Speaker Diarization Toolkit in Python , 2018, INTERSPEECH.
[4] Guillaume Gravier,et al. The ESTER phase II evaluation campaign for the rich transcription of French broadcast news , 2005, INTERSPEECH.
[5] Hervé Bredin,et al. pyannote.metrics: A Toolkit for Reproducible Evaluation, Diagnostic, and Error Analysis of Speaker Diarization Systems , 2017, INTERSPEECH.
[6] Kong-Aik Lee,et al. An extensible speaker identification sidekit in Python , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] David D. Cox,et al. Hyperopt: A Python Library for Optimizing the Hyperparameters of Machine Learning Algorithms , 2013, SciPy.
[8] Yonghong Yan,et al. A novel speaker clustering algorithm via supervised affinity propagation , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] Delbert Dueck,et al. Clustering by Passing Messages Between Data Points , 2007, Science.
[10] Mickael Rouvier,et al. An open-source state-of-the-art toolbox for broadcast news diarization , 2013, INTERSPEECH.
[11] Olivier Galibert,et al. The ETAPE corpus for the evaluation of speech-based TV content processing in the French language , 2012, LREC.
[12] Quan Wang,et al. Speaker Diarization with LSTM , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Thomas Fillon,et al. YAAFE, an Easy to Use and Efficient Audio Feature Extraction Software , 2010, ISMIR.
[14] Sylvain Meignier,et al. LIUM SPKDIARIZATION: AN OPEN SOURCE TOOLKIT FOR DIARIZATION , 2010 .
[15] Jean-Luc Gauvain,et al. Improving Speaker Diarization , 2004 .
[16] Tomi Kinnunen,et al. INTERSPEECH 2013 14thAnnual Conference of the International Speech Communication Association , 2013, Interspeech 2015.
[17] David D. Cox,et al. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures , 2013, ICML.
[18] Douglas A. Reynolds,et al. An overview of automatic speaker diarization systems , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[19] M. A. Siegler,et al. Automatic Segmentation, Classification and Clustering of Broadcast News Audio , 1997 .
[20] Guillaume Wisniewski,et al. Combining Speaker Turn Embedding and Incremental Structure Prediction for Low-Latency Speaker Diarization , 2017, INTERSPEECH.
[21] S. Chen,et al. Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion , 1998 .
[22] Jean-Luc Gauvain,et al. Minimum word error training of RNN-based voice activity detection , 2015, INTERSPEECH.
[23] Olivier Galibert,et al. The REPERE Corpus : a multimodal corpus for person recognition , 2012, LREC.
[24] Nicholas W. D. Evans,et al. Speaker Diarization: A Review of Recent Research , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[25] Claude Barras,et al. Speaker Change Detection in Broadcast TV Using Bidirectional Long Short-Term Memory Networks , 2017, INTERSPEECH.