Generalized cosine similarity in I-vector based automatic speaker recognition systems

This paper addresses the processing of I-vectors in text-independent speaker verification systems. A new technique for optimizing a generalized cosine similarity is proposed. The optimization is performed over sets of orthogonal and diagonal matrices, and Lie group techniques are used to obtain an exact orthogonal matrix solution. Experiments were performed on the NIST 2010 database. They show that the proposed algorithm yields a substantial reduction in equal error rate (EER).
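The abstract does not state the exact form of the generalized cosine similarity, so the following is only an illustrative sketch under assumptions. It assumes the score is the cosine between two I-vectors after each is mapped by an orthogonal matrix Q and then scaled by a diagonal matrix D. It also assumes that Q is parametrized as the matrix exponential of a skew-symmetric matrix, which is one standard Lie group device for keeping a matrix exactly orthogonal during optimization. All function names (`skew_from_params`, `expm`, `generalized_cosine`) are hypothetical, and the optimization loop itself is not shown.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def skew_from_params(params, n):
    """Build an n x n skew-symmetric matrix from n*(n-1)/2 free parameters."""
    s = [[0.0] * n for _ in range(n)]
    it = iter(params)
    for i in range(n):
        for j in range(i + 1, n):
            v = next(it)
            s[i][j] = v
            s[j][i] = -v
    return s

def expm(s, terms=30):
    """Truncated Taylor series for exp(s); adequate for small-norm s.
    For skew-symmetric s the result is orthogonal (up to rounding)."""
    n = len(s)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, s)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

def generalized_cosine(x, y, q, d):
    """Assumed score: cosine between D Q x and D Q y, with d the diagonal of D."""
    def transform(v):
        rotated = [sum(q[i][k] * v[k] for k in range(len(v)))
                   for i in range(len(v))]
        return [di * ri for di, ri in zip(d, rotated)]
    tx, ty = transform(x), transform(y)
    dot = sum(a * b for a, b in zip(tx, ty))
    norm_x = math.sqrt(sum(a * a for a in tx))
    norm_y = math.sqrt(sum(b * b for b in ty))
    return dot / (norm_x * norm_y)

# Toy 3-dimensional example (real I-vectors have hundreds of dimensions).
Q = expm(skew_from_params([0.3, -0.2, 0.5], 3))
x = [1.0, 2.0, 3.0]
y = [2.0, 1.0, 0.5]
score_identity_d = generalized_cosine(x, y, Q, [1.0, 1.0, 1.0])
score_weighted_d = generalized_cosine(x, y, Q, [2.0, 1.0, 0.5])
```

With D equal to the identity, the orthogonal map alone leaves the cosine unchanged, so in this assumed form the rotation only affects the score in combination with a non-trivial diagonal weighting.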
