Facial event mining using coupled hidden Markov models

Facial event mining is a key technique for automatic human face analysis and plays an important role in human-computer interaction. This paper proposes a new approach to facial event recognition that combines active shape models (ASMs) with coupled hidden Markov models (CHMMs). Based on the assumption that a complex facial event can be decomposed into multiple coupled processes, ASMs are used to track global facial features and to separate the pattern attributes of the upper and lower face. These two interacting processes are modeled as a CHMM for training and recognition. Four basic facial events are investigated. Preliminary experiments yield consistent results showing a significant advantage of CHMMs over conventional HMMs for facial event mining in video.
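
As an illustration of the coupling idea only, the following is a minimal sketch of how the likelihood of a two-chain CHMM could be evaluated with a scaled forward pass over the joint (upper face, lower face) state. It is not the paper's implementation: the function name `chmm_log_likelihood`, the parameter layout, and the use of discrete observation symbols (rather than ASM-derived continuous features) are all assumptions made for brevity.

```python
import numpy as np


def chmm_log_likelihood(pi_u, pi_l, A_u, A_l, B_u, B_l, obs_u, obs_l):
    """Log-likelihood of two observation streams under a two-chain CHMM.

    Hypothetical parameter layout (an assumption, not taken from the paper):
      pi_u[i], pi_l[j] : initial state probabilities of each chain
      A_u[i, j, k]     : P(upper_t = k | upper_{t-1} = i, lower_{t-1} = j)
      A_l[i, j, m]     : P(lower_t = m | upper_{t-1} = i, lower_{t-1} = j)
      B_u[k, o]        : P(upper observation = o | upper state = k)
      B_l[m, o]        : P(lower observation = o | lower state = m)
    """
    # Joint forward variable alpha[i, j] over (upper state, lower state).
    alpha = np.outer(pi_u * B_u[:, obs_u[0]], pi_l * B_l[:, obs_l[0]])
    scale = alpha.sum()
    log_lik = np.log(scale)
    alpha = alpha / scale

    for ou, ol in zip(obs_u[1:], obs_l[1:]):
        # Each chain's next state depends on BOTH previous states:
        # pred[k, m] = sum_{i,j} alpha[i, j] * A_u[i, j, k] * A_l[i, j, m]
        pred = np.einsum("ij,ijk,ijm->km", alpha, A_u, A_l)
        alpha = pred * np.outer(B_u[:, ou], B_l[:, ol])
        # Rescale at every step to avoid numerical underflow.
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha = alpha / scale

    return log_lik


# Toy usage: 2 hidden states and 2 symbols per chain, all distributions uniform.
pi = np.full(2, 0.5)
A = np.full((2, 2, 2), 0.5)
B = np.full((2, 2), 0.5)
print(chmm_log_likelihood(pi, pi, A, A, B, B, [0, 1, 1], [1, 0, 1]))
```

Under these assumptions, recognition could proceed by training one such model per facial event and assigning a sequence to the model with the highest log-likelihood. A conventional HMM corresponds to the special case in which each chain's transition ignores the other chain's previous state.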
