Improving the PLDA based speaker verification in limited microphone data conditions

Developing a robust speaker verification system requires a significant amount of speech data, but it is difficult to obtain enough development speech to match all expected conditions. In this paper we introduce a new approach to Gaussian probabilistic linear discriminant analysis (GPLDA) that estimates reliable model parameters as a linearly weighted model, drawing most of its input from the large volume of available telephone data and a proportionally smaller input from the limited microphone data. Compared with the traditional pooled training approach, in which the GPLDA model is trained on telephone and microphone speech together, this linear-weighted GPLDA approach is shown to provide better equal error rate (EER) and detection cost function (DCF) performance in microphone and mixed conditions on both the NIST 2008 and NIST 2010 evaluation corpora. Based on these results, we believe that linear-weighted GPLDA is a better approach than pooled GPLDA, allowing further improvement of GPLDA speaker verification in conditions with limited development data.
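
The contrast between pooled and linear-weighted training can be illustrated with a minimal sketch. The abstract does not say which GPLDA quantities are weighted, so this example assumes, purely for illustration, that between-speaker and within-speaker scatter matrices are estimated separately per channel and then combined. The names `scatter_matrices`, `linear_weighted_combination`, and `mic_weight`, the weight value, and the synthetic i-vectors are all hypothetical and are not taken from the paper.

```python
import numpy as np


def scatter_matrices(ivectors, speaker_ids):
    """Return (between-speaker, within-speaker) scatter, normalised by sample count."""
    ivectors = np.asarray(ivectors, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    global_mean = ivectors.mean(axis=0)
    dim = ivectors.shape[1]
    between = np.zeros((dim, dim))
    within = np.zeros((dim, dim))
    for spk in np.unique(speaker_ids):
        sessions = ivectors[speaker_ids == spk]
        spk_mean = sessions.mean(axis=0)
        offset = (spk_mean - global_mean)[:, None]
        between += len(sessions) * (offset @ offset.T)
        centered = sessions - spk_mean
        within += centered.T @ centered
    n = len(ivectors)
    return between / n, within / n


def linear_weighted_combination(tel_stat, mic_stat, mic_weight):
    """Assumed form: (1 - mic_weight) * telephone + mic_weight * microphone."""
    if not 0.0 <= mic_weight <= 1.0:
        raise ValueError("mic_weight must lie in [0, 1]")
    return (1.0 - mic_weight) * tel_stat + mic_weight * mic_stat


# Synthetic stand-ins for i-vectors: plentiful telephone data, scarce microphone data.
rng = np.random.default_rng(0)
tel_x = rng.normal(size=(200, 5))
tel_spk = np.repeat(np.arange(20), 10)
mic_x = rng.normal(size=(30, 5)) + 0.5
mic_spk = np.repeat(np.arange(6), 5)

tel_between, tel_within = scatter_matrices(tel_x, tel_spk)
mic_between, mic_within = scatter_matrices(mic_x, mic_spk)

# Hypothetical weight: telephone statistics dominate, microphone contributes less.
mic_weight = 0.2
between_weighted = linear_weighted_combination(tel_between, mic_between, mic_weight)
within_weighted = linear_weighted_combination(tel_within, mic_within, mic_weight)

# Pooled baseline for comparison: one estimate over all data together.
# Offsetting the microphone labels keeps the two speaker sets distinct.
pooled_between, pooled_within = scatter_matrices(
    np.vstack([tel_x, mic_x]),
    np.concatenate([tel_spk, mic_spk + 1000]),
)
```

In the pooled baseline each channel's influence is fixed by how much data it has, whereas the weighted form makes that balance an explicit parameter. In practice the weight would presumably be tuned on held-out development data.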
