Audio-Visual Recognition System with Intra-Modal Fusion

This paper proposes a new multimodal biometric recognition system based on feature fusion, intended to improve the robustness and circumvention resistance of conventional multimodal recognition systems. The feature sets produced by the visual and audio feature extraction systems are fused and classified by a radial basis function (RBF) neural network. In addition, 2DPCA is combined with LDA to further increase the recognition performance of the visual recognition system. Experimental results show that the proposed system achieves a higher recognition rate than the conventional multimodal recognition system, and that 2DPCA+LDA achieves a higher recognition rate than PCA, PCA+LDA, and 2DPCA.
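The feature-level fusion step can be illustrated with a minimal sketch. The abstract does not state how the two feature sets are combined, so the code below assumes a common baseline: each modality's feature vector is z-score normalized and the results are concatenated. The function names, the normalization choice, and the feature values are hypothetical and are not taken from the paper.

```python
from math import sqrt


def zscore(vec):
    """Scale one feature vector to zero mean and unit variance."""
    n = len(vec)
    mean = sum(vec) / n
    var = sum((x - mean) ** 2 for x in vec) / n
    std = sqrt(var) or 1.0  # guard against constant vectors
    return [(x - mean) / std for x in vec]


def fuse_features(visual, audio):
    """Assumed fusion rule: normalize each modality, then concatenate."""
    return zscore(visual) + zscore(audio)


# Hypothetical feature values, for illustration only.
visual_features = [0.2, 0.4, 0.6]          # stand-in for the visual extractor output
audio_features = [10.0, 20.0, 30.0, 40.0]  # stand-in for the audio extractor output

fused = fuse_features(visual_features, audio_features)
```

Normalizing before concatenation keeps the modality with the larger numeric range from dominating the fused vector, which would then be passed to a classifier such as the RBF network named in the abstract.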
