An Advanced Learning Environment Aided by Recognition of Multi-modal Social Signals

In this study, we developed an advanced learning environment that aims to help children, especially those with Autism Spectrum Disorder, learn social communication skills. The environment estimates a child's affective and cognitive state in real time through the recognition of multi-modal social signals, and generates interactive narratives with embodied virtual characters. One of its key components is the Visual Inputs Processor, which is the first to detect a child's attention and facial expression simultaneously in a natural environment. Moreover, it does so in a non-intrusive manner, avoiding the potential bias in a user's behaviour that intrusive alternatives can introduce. The environment also employs multiple inexpensive cameras and a large multi-touch screen; these settings (1) maximize the space and range of angles over which observation can be performed and (2) provide users with a much more realistic experience than comparable approaches do. A minimal illustrative sketch of such a multi-camera visual processing loop is given below.
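The sketch below is not the authors' implementation; it is a minimal illustration, assuming OpenCV, of how a non-intrusive Visual Inputs Processor might poll several inexpensive cameras, detect faces with a Viola-Jones-style Haar cascade, and hand the detected face regions to attention and expression modules. The helpers `estimate_attention` and `classify_expression` are hypothetical placeholders for those modules.

```python
# Hypothetical sketch of a multi-camera, non-intrusive visual input loop.
# Assumes OpenCV (cv2); estimate_attention and classify_expression are
# placeholders, not the system's actual attention/expression modules.
import cv2

CAMERA_INDICES = [0, 1, 2]  # assumed: several inexpensive USB cameras

# Viola-Jones-style Haar cascade face detector shipped with OpenCV.
face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def estimate_attention(face_roi):
    """Placeholder for a head-pose / gaze-based attention estimate in [0, 1]."""
    return 0.5

def classify_expression(face_roi):
    """Placeholder for a coarse facial-expression classifier."""
    return "neutral"

def process_frame(frame):
    """Detect faces in one frame and return (attention, expression) per face."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    results = []
    for (x, y, w, h) in faces:
        roi = gray[y:y + h, x:x + w]
        results.append((estimate_attention(roi), classify_expression(roi)))
    return results

def main():
    captures = [cv2.VideoCapture(i) for i in CAMERA_INDICES]
    try:
        while True:
            for cam_id, cap in enumerate(captures):
                ok, frame = cap.read()
                if not ok:
                    continue
                for attention, expression in process_frame(frame):
                    print(f"camera {cam_id}: attention={attention:.2f}, "
                          f"expression={expression}")
    finally:
        for cap in captures:
            cap.release()

if __name__ == "__main__":
    main()
```

Polling multiple cameras from a single loop keeps the setup non-intrusive (no head-mounted hardware) while widening the observable space and viewing angles, at the cost of having to fuse per-camera estimates downstream.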
