Front-end Diversity in Fused Speaker Recognition Systems

As fusion becomes more widely used in speaker recognition systems, one thread of current research seeks new features that complement MFCCs and so advance the state of the art in fused systems. This paper investigates variations in MFCC extraction that introduce diversity among fused subsystems, each of which is based on a different MFCC-variant feature. In particular, changing the filter shape is found to yield modified MFCCs that perform promisingly in fused systems, provided that the filterbank ripple is minimised. Evaluations on the NIST 2006 SRE database show a relative improvement of 17% in EER when one modified MFCC subsystem is fused with a conventional MFCC-based system, and an improvement of 22% when two modified MFCC subsystems are fused.
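To make the notion of filter shape and filterbank ripple concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a mel-spaced filterbank, compares the conventional triangular shape with a Gaussian shape chosen purely for illustration, and measures ripple as the spread (maximum minus minimum) of the summed filter responses between the first and last centre frequencies. The function names, the `width_scale` parameter, the 24-filter count, and the 0 to 4000 Hz range are all assumptions.

```python
import math


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_points(n_filters, f_low, f_high):
    """Return n_filters + 2 frequencies (Hz) equally spaced on the mel scale.

    Filter i uses points i, i+1, i+2 as its left edge, centre, and right edge.
    """
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + i * step) for i in range(n_filters + 2)]


def triangular(f, left, centre, right):
    """Conventional triangular filter response at frequency f."""
    if f <= left or f >= right:
        return 0.0
    if f <= centre:
        return (f - left) / (centre - left)
    return (right - f) / (right - centre)


def gaussian(f, left, centre, right, width_scale=0.5):
    """Illustrative Gaussian filter response (hypothetical alternative shape).

    Assumption: sigma is width_scale times half the filter bandwidth.
    """
    sigma = width_scale * (right - left) / 2.0
    return math.exp(-0.5 * ((f - centre) / sigma) ** 2)


def filterbank_ripple(shape, n_filters=24, f_low=0.0, f_high=4000.0, n_points=512):
    """Spread of the summed filter responses between first and last centres."""
    pts = mel_points(n_filters, f_low, f_high)
    lo, hi = pts[1], pts[-2]
    totals = []
    for k in range(n_points):
        f = lo + (hi - lo) * k / (n_points - 1)
        totals.append(
            sum(shape(f, pts[i], pts[i + 1], pts[i + 2]) for i in range(n_filters))
        )
    return max(totals) - min(totals)


if __name__ == "__main__":
    print("triangular ripple:", filterbank_ripple(triangular))
    print("gaussian ripple:  ", filterbank_ripple(gaussian))
```

Under these assumptions, triangular filters whose edges sit at the neighbouring centres sum to a flat response, so their ripple is zero up to rounding, while the Gaussian bank shows a measurable ripple that depends on `width_scale`. Tuning such a width parameter is one plausible way to reduce ripple for a non-triangular shape.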
