Comparing Template-Based, Feature-Based and Supervised Classification of Facial Expressions from Static Images

We compare the performance and generalization capabilities of different low-dimensional representations for facial emotion classification from static face images showing happy, angry, sad, and neutral expressions. Three general strategies are compared. The first uses the average face of each class as a generic template and assigns each facial expression to the best-matching template. The second trains a multi-layered perceptron with the backpropagation-of-error algorithm on a subset of the facial expressions and subsequently tests it on unseen face images. The third introduces a preprocessing step before the perceptron learns an internal representation: a feature-extraction stage computes the oriented responses of six odd-symmetric and six even-symmetric Gabor filters at each pixel position in the image. The template-based approach reached up to 75% correct classification, i.e. correct recognition of three out of four expressions, but its generalization performance reached only about 50%. The multi-layered perceptron trained on the raw face images almost always reached 100% classification performance on the test set, but its generalization performance on new images varied from 40% to 80% correct recognition, depending on the choice of test images. The preprocessing stage did not improve generalization performance and slowed learning down by a factor of ten. We conclude that a template-based approach to emotion classification from static images has only very limited recognition and generalization capabilities. This poor performance can be attributed to the smoothing of facial detail caused by small misalignments of the faces and to the large inter-personal differences in facial expression present in the data set.
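The template-based strategy described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes aligned faces stored as flattened grayscale NumPy arrays, and it uses Euclidean distance as the match criterion; the function names are illustrative.

```python
import numpy as np

def build_templates(images, labels):
    """Average all training images of each class into one generic template.

    images: (N, H*W) array of flattened, aligned grayscale faces
    labels: length-N array of class labels (e.g. 'happy', 'angry', ...)
    """
    return {c: images[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(face, templates):
    """Assign the class whose average-face template matches best
    (here: smallest Euclidean distance to the input face)."""
    return min(templates, key=lambda c: np.linalg.norm(face - templates[c]))
```

Note that averaging many imperfectly aligned faces blurs exactly the local detail (mouth corners, eyebrows) that distinguishes expressions, which is consistent with the limited recognition performance reported above.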
Although the nonlinear extraction of appropriate key features from facial expressions by the multi-layered perceptron maximizes classification performance, the generalization performance usually reaches only about 60%.

Key-Words: facial analysis, emotion recognition, static face images, MLP

CSCC'99 Proceedings, pages 5331-5336
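The feature-extraction stage described in the abstract (six odd-symmetric and six even-symmetric Gabor filters) can be sketched as below. This is a hedged illustration with assumed parameter values (kernel size, wavelength, sigma); the paper does not specify these, only the filter counts and symmetries. An even-symmetric filter uses a cosine carrier, an odd-symmetric one a sine carrier (phase shift of pi/2).

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, phase):
    """2-D Gabor kernel: Gaussian envelope times an oriented sinusoid.
    phase = 0 gives an even-symmetric filter, phase = pi/2 an odd-symmetric one."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotate coordinates by theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + yr**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * xr / wavelength + phase)
    return envelope * carrier

def gabor_bank(size=15, wavelength=6.0, sigma=3.0, n_orient=6):
    """Six even- and six odd-symmetric filters at evenly spaced orientations
    (parameter values are illustrative assumptions)."""
    thetas = [k * np.pi / n_orient for k in range(n_orient)]
    even = [gabor_kernel(size, wavelength, t, sigma, 0.0) for t in thetas]
    odd = [gabor_kernel(size, wavelength, t, sigma, np.pi / 2) for t in thetas]
    return even + odd
```

Convolving an input face with all twelve kernels yields the oriented responses at each pixel position, which then serve as input to the perceptron instead of raw gray values.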
