Automatic Engagement Prediction with GAP Feature

In this paper, we propose an automatic engagement prediction method for the Engagement in the Wild sub-challenge of EmotiW 2018. We first design a novel Gaze-AU-Pose (GAP) feature that combines the gaze, action unit, and head pose information of a subject. The GAP feature is then used for the subsequent engagement level prediction. To efficiently predict the engagement level of a long video, we divide the video into multiple overlapping clips and extract a GAP feature for each clip. A deep model consisting of a Gated Recurrent Unit (GRU) layer and a fully connected layer is used as the engagement predictor. Finally, a mean pooling layer is applied to the per-clip estimates to obtain the engagement level of the whole video. Experimental results on the validation and test sets show the effectiveness of the proposed approach. In particular, our approach achieves a promising result with an MSE of 0.0724 on the test set of the Engagement Prediction Challenge of EmotiW 2018.
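As a rough illustration of the pipeline described above, the following PyTorch-style sketch shows a GRU layer over per-clip GAP features, a fully connected layer producing a per-clip engagement estimate, and mean pooling over clips. The class name, feature dimension, and hidden size are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class EngagementPredictor(nn.Module):
    """Sketch of the predictor in the abstract: GRU + FC + mean pooling.

    Dimensions below (gap_dim, hidden_dim) are assumed for illustration.
    """
    def __init__(self, gap_dim=54, hidden_dim=64):
        super().__init__()
        self.gru = nn.GRU(input_size=gap_dim, hidden_size=hidden_dim,
                          batch_first=True)
        self.fc = nn.Linear(hidden_dim, 1)

    def forward(self, gap_feats):
        # gap_feats: (batch, num_clips, gap_dim), one GAP feature per clip
        hidden, _ = self.gru(gap_feats)          # (batch, num_clips, hidden_dim)
        per_clip = self.fc(hidden).squeeze(-1)   # per-clip engagement estimates
        return per_clip.mean(dim=1)              # mean pooling over clips

# Usage: predict engagement for one video represented by 20 overlapping clips.
model = EngagementPredictor()
video_feats = torch.randn(1, 20, 54)  # hypothetical GAP features per clip
engagement = model(video_feats)       # one engagement level for the video
```

The mean pooling at the end reflects the abstract's final step: each clip yields its own estimate, and the video-level engagement is their average.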
