Histogram Equalization in SVM Multimodal Person Verification

Prosody has been shown to improve speaker recognition systems based on the voice spectrum. Prosodic features can therefore also be used in multimodal person verification to achieve better results. In this paper, a multimodal recognition system based on facial and vocal tract spectral features is improved by adding prosodic information. The matcher weighting method and support vector machines (SVMs) are used as fusion techniques, and histogram equalization is applied before SVM fusion as a normalization technique. The results show that the performance of an SVM multimodal verification system can be improved by histogram equalization, especially when the equalization is applied to the scores with the highest equal error rate (EER).
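The two score-level steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the equalization procedure or the reference distribution, so the quantile mapping onto a reference score set is an assumption, as are the function names `equalize` and `matcher_weights`. The matcher weighting shown uses weights inversely proportional to each matcher's EER, a common formulation, and the SVM fusion step itself is omitted.

```python
from bisect import bisect_right


def equalize(scores, reference):
    """Histogram-equalize `scores` onto the distribution of `reference`.

    Each score is replaced by the reference value at the same empirical
    quantile, so the output follows the reference histogram while keeping
    the rank order of the input scores. (Assumed procedure, see lead-in.)
    """
    src = sorted(scores)
    ref = sorted(reference)
    n, m = len(src), len(ref)
    out = []
    for s in scores:
        # Number of source scores <= s, i.e. n times the empirical CDF at s.
        rank = bisect_right(src, s)
        # Inverse empirical CDF of the reference: ceil(rank * m / n) - 1.
        idx = (rank * m + n - 1) // n - 1
        out.append(ref[idx])
    return out


def matcher_weights(eers):
    """Weights inversely proportional to each matcher's EER, summing to 1."""
    inverse = [1.0 / e for e in eers]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuse(score_vectors, weights):
    """Weighted sum of per-matcher scores for each verification trial."""
    return [sum(w * s for w, s in zip(weights, trial))
            for trial in zip(*score_vectors)]


if __name__ == "__main__":
    # Hypothetical scores from two matchers on three trials.
    face = [0.1, 0.2, 0.3]
    prosody = [30.0, 10.0, 20.0]           # different scale and spread
    prosody_eq = equalize(prosody, face)   # now shares the face histogram
    weights = matcher_weights([0.02, 0.08])  # hypothetical EERs
    print(prosody_eq, weights, fuse([face, prosody_eq], weights))
```

Equalizing before fusion puts scores from different matchers on a common distribution, so that a fusion rule is not dominated by the matcher with the widest score range.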