ALIZE/spkdet: a state-of-the-art open source software for speaker recognition

This paper presents the ALIZE/SpkDet open source software packages for text-independent speaker recognition. The software is based on the well-known UBM/GMM approach. It also includes the latest speaker recognition developments, such as Latent Factor Analysis (LFA) and unsupervised adaptation. Discriminant classifiers such as SVM supervectors are also provided, coupled with Nuisance Attribute Projection (NAP). The software's performance is demonstrated within the framework of the NIST'06 SRE evaluation campaign. Several other applications, such as speaker diarization, embedded speaker recognition, password-dependent speaker recognition, and pathological voice assessment, are also presented.
