Constrained discriminative PLDA training for speaker verification

Many studies have shown the effectiveness of discriminative training for speaker verification based on probabilistic linear discriminant analysis (PLDA) with i-vectors as features. Most of them directly optimize the log-likelihood ratio score function of the PLDA model instead of explicitly training the PLDA model. However, this optimization process removes some of the constraints that are normally imposed on the PLDA log-likelihood ratio score function, which may degrade verification performance when the amount of training data is limited. In this paper, we first present two constraints that the score function should satisfy, and then propose a new constrained discriminative training algorithm that preserves these constraints. Our experiments show that the method obtains significant improvements in verification performance on the male trials of the telephone speaker verification tasks of NIST SRE08 and SRE10.
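
For readers unfamiliar with the score function mentioned above, the following is a minimal sketch of a PLDA log-likelihood ratio computed from an explicitly specified generative model. It assumes the two-covariance formulation of PLDA (speaker mean drawn from N(mu, B), i-vector drawn from N(speaker mean, W)); the abstract does not state which PLDA variant is used, nor what the two constraints are, so the function names and the toy parameters here are purely illustrative.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian N(mean, cov) evaluated at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + maha)

def plda_llr(x1, x2, mu, B, W):
    """Log-likelihood ratio: same-speaker vs. different-speaker hypothesis.

    Assumed model (two-covariance PLDA, an assumption of this sketch):
      B is the between-speaker covariance, W the within-speaker covariance.
    """
    T = B + W                                  # total covariance of one i-vector
    z = np.concatenate([x1, x2])               # stacked trial pair
    m = np.concatenate([mu, mu])
    zero = np.zeros_like(B)
    cov_same = np.block([[T, B], [B, T]])      # shared speaker couples the pair
    cov_diff = np.block([[T, zero], [zero, T]])  # independent speakers
    return gaussian_logpdf(z, m, cov_same) - gaussian_logpdf(z, m, cov_diff)

# Toy parameters (hypothetical, not taken from the paper).
dim = 3
mu = np.zeros(dim)
B = 4.0 * np.eye(dim)
W = np.eye(dim)
x1 = np.array([1.0, -0.5, 2.0])
x2 = np.array([0.3, 0.8, -1.2])

score_12 = plda_llr(x1, x2, mu, B, W)
score_21 = plda_llr(x2, x1, mu, B, W)
```

In this generative form the score is a quadratic function of the two i-vectors whose coefficients are all determined by mu, B, and W. Discriminative approaches that optimize those coefficients directly treat them as free parameters, which is the setting the abstract describes.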
