Facial Analysis from Continuous Video with Applications to Human-Computer Interface

This thesis is about computer vision algorithms for the analysis of video data involving faces. Such video, obtained for example from a camera aimed at the user of an interactive system, can enhance the interface between users and machines. These image sequences provide information from which machines can identify and keep track of their users, recognize their facial expressions and gestures, and complement other forms of human-computer interaction.

First, we present a learning technique based on information-theoretic discrimination, which is used to construct face and facial feature detectors. Next, we describe a real-time system for face and facial feature detection and tracking in continuous video. Finally, we present a probabilistic framework for embedded face and facial expression recognition from image sequences.

The learning technique, referred to in this thesis as information-based maximum discrimination, uses the information-theoretic divergence as the optimization criterion to maximize the discrimination between two classes of objects. The likelihood functions obtained with this learning technique are then used for object detection, posed as maximum likelihood classification between two classes, i.e., faces and background. Because the data and probability models are discrete, both the learning procedure and the object classification algorithm are implemented very efficiently. This has allowed us to develop a real-time system capable of detecting and tracking multiple faces and nine facial features against complex backgrounds.

The algorithm described in this thesis for embedded face and facial expression recognition is based on a novel probabilistic framework in which faces are modeled not only by their appearance but also by the spatio-temporal deformation patterns of their expressions. Face recognition and facial expression recognition are carried out in a maximum likelihood setting: given an image sequence, the algorithm finds the person model and the facial expression that maximize the likelihood of the observed images. In this framework, facial appearance matching is enhanced by facial expression modeling, and changes in facial features due to expressions are used together with facial deformation patterns to perform expression recognition.
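The detection step reduces to a likelihood-ratio test over discrete probability models, so each decision is a sum of table lookups over a candidate window. The following is a minimal sketch of that idea, assuming histogram-based class-conditional models of quantized pixel values; the class name, quantization scheme, and interface are illustrative assumptions, not the thesis's actual implementation.

import numpy as np

class LikelihoodRatioDetector:
    """Face/background classifier based on summed per-position log-likelihood ratios."""

    def __init__(self, face_hist, background_hist, threshold, eps=1e-6):
        # face_hist, background_hist: arrays of shape (n_positions, n_levels)
        # holding estimates of P(value | position) for faces and for background.
        # Precomputing the log-likelihood-ratio table turns detection into
        # a sum of table lookups over the candidate window.
        self.llr_table = np.log(face_hist + eps) - np.log(background_hist + eps)
        self.threshold = threshold

    def score(self, window):
        # window: 1-D integer array of quantized values, one per position.
        positions = np.arange(window.size)
        return float(self.llr_table[positions, window].sum())

    def is_face(self, window):
        # Declare a face when the summed log-likelihood ratio exceeds the threshold.
        return self.score(window) > self.threshold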

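The recognition framework searches over all person models and expression classes for the pair that maximizes the likelihood of the observed sequence. The sketch below illustrates that joint maximization only; the model interface and the frame-independence simplification are assumptions made for illustration, since the thesis models spatio-temporal deformation patterns rather than independent frames.

import numpy as np

def recognize_sequence(frames, frame_log_likelihoods):
    """frames: list of observation vectors, one per video frame.
    frame_log_likelihoods: dict mapping (person, expression) -> callable that
    returns the log-likelihood of a single frame under that joint model.
    Returns the (person, expression) pair maximizing the sequence likelihood,
    computed here under a simplifying frame-independence assumption."""
    best_pair, best_ll = None, -np.inf
    for pair, log_lik in frame_log_likelihoods.items():
        ll = sum(log_lik(f) for f in frames)
        if ll > best_ll:
            best_pair, best_ll = pair, ll
    return best_pair, best_ll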