Fusion Technique for User Identification Using Camera and Microphone in the Intelligent Service Robots

This paper proposes a user identification method for intelligent service robots that combines face and speaker information obtained from a camera and a microphone. For face recognition, we use the fisherface method, chosen for its insensitivity to large variations in lighting direction, face pose, and facial expression. For speaker recognition, we use a Gaussian Mixture Model (GMM) classifier with Mel-Frequency Cepstral Coefficients (MFCCs) as the feature vector. A weighted sum is used to fuse the cosine similarity produced by the fisherface method with the log-likelihood produced by the GMM classifier. Experiments on WEVER, a research robot platform developed at ETRI, show that the proposed fusion method performs better than either the fisherface method or the GMM classifier used alone.
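The weighted-sum fusion described above can be illustrated with a minimal Python sketch. The abstract does not say how the two scores are brought onto a common scale or how the weight is chosen, so the min-max normalization, the default weight of 0.5, and all function names here (`min_max_normalize`, `fuse_scores`, `identify`) are assumptions made for illustration, not the authors' implementation. Each list holds one score per enrolled user: cosine similarities from the face recognizer and log-likelihoods from the speaker GMMs.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1] (assumed normalization; not specified in the paper)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All candidates tie, so give them the same neutral score.
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(face_scores, speaker_scores, face_weight=0.5):
    """Weighted sum of normalized face and speaker scores, one value per user.

    face_scores:    cosine similarities, higher means a better match
    speaker_scores: GMM log-likelihoods, higher means a better match
    face_weight:    hypothetical weight in [0, 1]; the speaker weight is 1 - face_weight
    """
    if len(face_scores) != len(speaker_scores):
        raise ValueError("both modalities must score the same set of users")
    if not 0.0 <= face_weight <= 1.0:
        raise ValueError("face_weight must lie in [0, 1]")
    face_norm = min_max_normalize(face_scores)
    speaker_norm = min_max_normalize(speaker_scores)
    return [
        face_weight * f + (1.0 - face_weight) * s
        for f, s in zip(face_norm, speaker_norm)
    ]


def identify(face_scores, speaker_scores, face_weight=0.5):
    """Return the index of the enrolled user with the highest fused score."""
    fused = fuse_scores(face_scores, speaker_scores, face_weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Illustrative scores for three enrolled users (made-up numbers).
face_scores = [0.9, 0.8, 0.1]            # cosine similarities
speaker_scores = [-50.0, -20.0, -60.0]   # log-likelihoods
best_user = identify(face_scores, speaker_scores)
```

Normalizing before the sum matters because cosine similarity is bounded while log-likelihood is unbounded and usually negative; without a common scale, one modality would dominate regardless of the weight. In practice the weight would be tuned on held-out data.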
