Combining Speaker Turn Embedding and Incremental Structure Prediction for Low-Latency Speaker Diarization
Guillaume Wisniewski | Claude Barras | Gregory Gelly | Hervé Bredin
[1] Giorgio Satta,et al. Guided Learning for Bidirectional Sequence Classification , 2007, ACL.
[2] Yu Qiao,et al. A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.
[3] Hervé Bredin,et al. pyannote.metrics: A Toolkit for Reproducible Evaluation, Diagnostic, and Error Analysis of Speaker Diarization Systems , 2017, INTERSPEECH.
[4] Daben Liu,et al. Online speaker clustering , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[5] Mohammad Hossein Moattar,et al. A review on speaker diarization systems and approaches , 2012, Speech Commun.
[6] J. Andrew Bagnell,et al. Efficient Reductions for Imitation Learning , 2010, AISTATS.
[7] Jean-Luc Gauvain,et al. Multistage speaker diarization of broadcast news , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[8] James Philbin,et al. FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Mickael Rouvier,et al. An open-source state-of-the-art toolbox for broadcast news diarization , 2013, INTERSPEECH.
[10] Olivier Galibert,et al. The ETAPE corpus for the evaluation of speech-based TV content processing in the French language , 2012, LREC.
[11] Nicholas W. D. Evans,et al. Speaker Diarization: A Review of Recent Research , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[12] Daniel Marcu,et al. Learning as search optimization: approximate large margin methods for structured prediction , 2005, ICML.
[13] François Yvon,et al. Structured prediction for speaker identification in TV series , 2015, INTERSPEECH.
[14] Guillaume Wisniewski,et al. PanParser: a Modular Implementation for Efficient Transition-Based Dependency Parsing , 2018, Prague Bull. Math. Linguistics.
[15] Tian Zhang,et al. BIRCH: A New Data Clustering Algorithm and Its Applications , 1997, Data Mining and Knowledge Discovery.
[16] Thomas Fillon,et al. YAAFE, an Easy to Use and Efficient Audio Feature Extraction Software , 2010, ISMIR.
[17] Brian Roark,et al. Incremental Parsing with the Perceptron Algorithm , 2004, ACL.
[18] Shoei Sato,et al. Low-latency speaker diarization based on Bayesian information criterion with multiple phoneme classes , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Joakim Nivre,et al. Training Deterministic Parsers with Non-Deterministic Oracles , 2013, TACL.
[20] Koby Crammer,et al. On the Algorithmic Implementation of Multiclass Kernel-based Vector Machines , 2002, J. Mach. Learn. Res.
[21] Mauro Cettolo. Segmentation, classification and clustering of an Italian broadcast news corpus , 2000.
[22] Jean-Luc Gauvain,et al. Spoken Language Identification Using LSTM-Based Angular Proximity , 2017, INTERSPEECH.
[23] Yoram Singer,et al. Pegasos: primal estimated sub-gradient solver for SVM , 2011, Math. Program.
[24] Satoshi Nakamura,et al. Improved novelty detection for online GMM based speaker diarization , 2008, INTERSPEECH.
[25] Olivier Galibert,et al. The REPERE Corpus : a multimodal corpus for person recognition , 2012, LREC.
[26] Hervé Bredin,et al. TristouNet: Triplet loss for speaker turn embedding , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Joakim Nivre,et al. Algorithms for Deterministic Incremental Dependency Parsing , 2008, CL.
[28] Jean-Luc Gauvain,et al. Partitioning and transcription of broadcast news data , 1998, ICSLP.