Diarization resegmentation in the factor analysis subspace

Resegmentation is an important post-processing step that refines the rough speaker boundaries produced by diarization systems that cluster the segments of an initial uniform segmentation. Past work has primarily used Viterbi resegmentation with MFCC features for this purpose. In this paper, we examine a resegmentation algorithm that operates instead in the factor analysis subspace. By combining this algorithm with a speaker clustering front-end, we achieve a diarization error rate of 11.5% on the CALLHOME conversational telephone speech corpus.
