Human-Centred Intelligent Human-Computer Interaction (HCI2): how far are we from attaining it?

A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. To realise this prediction, next-generation computing should develop anticipatory user interfaces that are human-centred: built for humans and based on naturally occurring multimodal human communication. Such interfaces should transcend the traditional keyboard and mouse and be able to understand and emulate human communicative intentions as expressed through behavioural cues such as affective and social signals. This article discusses how close we are to the goal of human-centred computing and of Human-Centred Intelligent Human-Computer Interaction (HCI2) that can understand and respond to multimodal human communication.
