Recognizing eyebrow and periodic head gestures using CRFs for non-manual grammatical marker detection in ASL

Changes in eyebrow configuration, in combination with head gestures and other facial expressions, signal essential grammatical information in signed languages. Motivated by the goal of improving the detection of non-manual grammatical markings in American Sign Language (ASL), we introduce a 2-level CRF method for recognizing the components of eyebrow and periodic head gestures, differentiating the linguistically significant domain (the core) from the transitional movements that precede and follow it (which we refer to as the onset and offset). We use a robust face tracker and 3D warping to extract and combine geometric and appearance features, and a feature selection method to further improve recognition accuracy. For the second level of the CRFs, linguistic annotations were used as training data to partition the gestures, separating the onset and offset from the linguistically significant domain in between; this partitioning is essential to recognizing that domain. We then use the recognized onset, core, and offset of these gestures, together with the lower-level features, to detect non-manual grammatical markers in ASL.
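
As a rough illustration of this kind of pipeline, the sketch below chains two linear-chain CRFs: the first labels each frame as onset, core, or offset of a gesture component, and the second uses those labels together with the lower-level features to label grammatical-marker spans. It assumes the sklearn_crfsuite library, and the feature and label names are illustrative placeholders; this is a simplification under those assumptions, not the authors' implementation, whose 2-level CRF structure and features differ in detail.

    # Minimal sketch (not the authors' code) of a two-stage CRF pipeline,
    # assuming sklearn_crfsuite for linear-chain CRFs.
    from typing import Dict, List

    import sklearn_crfsuite

    Frame = Dict[str, float]      # per-frame geometric + appearance features
    Sequence = List[Frame]

    def train_component_crf(X: List[Sequence],
                            y: List[List[str]]) -> sklearn_crfsuite.CRF:
        """Stage 1: label each frame as 'onset', 'core', or 'offset' of a
        gesture component (e.g., raised eyebrows, periodic head nod),
        trained from linguistically annotated sequences."""
        crf = sklearn_crfsuite.CRF(algorithm='lbfgs', c1=0.1, c2=0.1,
                                   max_iterations=200)
        crf.fit(X, y)
        return crf

    def train_marker_crf(component_crf: sklearn_crfsuite.CRF,
                         X: List[Sequence],
                         y_markers: List[List[str]]) -> sklearn_crfsuite.CRF:
        """Stage 2: feed the stage-1 onset/core/offset predictions back in as
        an extra feature, alongside the lower-level features, to label
        non-manual grammatical markers (e.g., 'wh-question', 'negation')."""
        component_labels = component_crf.predict(X)
        X_aug = [[{**frame, 'component': label}
                  for frame, label in zip(seq, labels)]
                 for seq, labels in zip(X, component_labels)]
        crf = sklearn_crfsuite.CRF(algorithm='lbfgs', c1=0.1, c2=0.1,
                                   max_iterations=200)
        crf.fit(X_aug, y_markers)
        return crf

The design point the sketch is meant to convey is the one in the abstract: the marker-level labeler does not see only raw features, but also the onset/core/offset segmentation, so that transitional movements are not mistaken for the linguistically significant domain.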
