Speaker Verification via Estimating Total Variability Space Using Probabilistic Partial Least Squares

The i-vector framework is one of the most popular methods in speaker verification, and estimating the total variability space (TVS) is a key step in that framework. Current estimation methods pay little attention to the discriminative power of the TVS, even though it directly affects verification performance. This paper therefore proposes a discriminative TVS estimation method based on probabilistic partial least squares (PPLS). The method improves discrimination by using prior information (speaker labels), so that both intra-class correlation and inter-class discrimination are exploited. It also adopts a probabilistic view of the partial least squares (PLS) method, which overcomes the high computational complexity of PLS and its inability to perform channel compensation. The proposed method achieves better performance than both the traditional TVS estimation method and the PLS-based method.
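As a rough illustration of how speaker labels can steer a projection, the sketch below computes a plain (non-probabilistic) PLS-style subspace from the SVD of the cross-covariance between feature vectors and one-hot speaker labels. It is not the PPLS estimator proposed in the paper. The function name `pls_projection`, the synthetic data, and the dimensions are all assumptions chosen for illustration.

```python
import numpy as np


def pls_projection(X, labels, n_components):
    """Hypothetical helper: label-aware projection via PLS-SVD.

    X       : (n_utterances, dim) feature vectors (stand-ins for supervectors)
    labels  : (n_utterances,) speaker identifiers
    returns : (dim, n_components) matrix with orthonormal columns
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)

    # Encode speaker labels as a one-hot response matrix Y.
    speakers, idx = np.unique(labels, return_inverse=True)
    Y = np.eye(len(speakers))[idx]

    # Centre both blocks, then form the cross-covariance (up to a scale factor).
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc

    # Leading left singular vectors are the directions in feature space
    # that covary most strongly with the speaker labels.
    U, _, _ = np.linalg.svd(C, full_matrices=False)
    return U[:, :n_components]


# Synthetic example (assumed data): 3 speakers, 20 utterances each, 10-dim features.
rng = np.random.default_rng(0)
n_speakers, per_speaker, dim = 3, 20, 10
means = rng.normal(scale=5.0, size=(n_speakers, dim))
labels = np.repeat(np.arange(n_speakers), per_speaker)
X = means[labels] + rng.normal(size=(len(labels), dim))

W = pls_projection(X, labels, n_components=2)
Z = (X - X.mean(axis=0)) @ W  # low-dimensional, label-aware representation
```

With S speakers the centred one-hot matrix has rank at most S - 1, so this sketch yields at most S - 1 meaningful directions. A probabilistic formulation such as PPLS replaces this deterministic decomposition with a latent-variable model.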
