Associating Facial Expressions and Upper-Body Gestures with Learning Tasks for Enhancing Intelligent Tutoring Systems

Learning involves a wide range of cognitive, social, and emotional states. Recognizing and understanding these states in the context of learning is therefore key to designing informed interventions and addressing the needs of the individual student to provide personalized education. In this paper, we explore the automatic detection of learners' nonverbal behaviors during learning, including hand-over-face gestures, head and eye movements, and emotions expressed through facial expressions. The proposed computer-vision-based behavior monitoring method uses a low-cost webcam and can easily be integrated with modern tutoring technologies. We investigate these behaviors in depth over time in a 40-minute classroom session involving reading and problem-solving exercises. The exercises in the session are divided into three difficulty levels (easy, medium, and difficult) within the context of undergraduate computer science. We found a significant increase in head and eye movements both as time progresses and as the difficulty level rises. We show that hand-over-face gestures occur frequently (21.35% of the time on average) during the 40-minute session, a behavior that remains unexplored in the education domain. We propose a novel deep learning approach for the automatic detection of hand-over-face gestures in images, achieving a classification accuracy of 86.87%. Hand-over-face gestures increase markedly as the difficulty of the given exercise increases, and they occur more frequently during problem-solving exercises (easy 23.79%, medium 19.84%, difficult 30.46%) than during reading exercises (easy 16.20%, medium 20.06%, difficult 20.18%).
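The abstract describes the gesture detector only at a high level. One common pipeline consistent with that description is to extract feature vectors from webcam frames with a pretrained deep network and train a lightweight binary classifier (hand-over-face vs. no gesture) on top. The sketch below is a minimal, hedged illustration of that second stage only: the "features" are synthetic placeholder clusters, not real CNN embeddings, and the logistic-regression head is an assumption, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for CNN features: in the paper's setting these would be
# embeddings of webcam frames from a pretrained network; here we draw
# two linearly separable synthetic clusters (hypothetical data).
n, d = 200, 64
X_pos = rng.normal(loc=0.5, scale=0.3, size=(n, d))   # "hand-over-face" frames
X_neg = rng.normal(loc=-0.5, scale=0.3, size=(n, d))  # "no gesture" frames
X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(n), np.zeros(n)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Logistic-regression head trained with plain gradient descent.
w = np.zeros(d)
b = 0.0
lr = 0.1
for _ in range(500):
    p = sigmoid(X @ w + b)
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

preds = (sigmoid(X @ w + b) > 0.5).astype(float)
accuracy = np.mean(preds == y)
print(f"training accuracy: {accuracy:.2f}")
```

In practice the binary decision would be made per frame, and the per-session percentages reported above would then be the fraction of frames classified as hand-over-face.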
