Toward an affect-sensitive multimodal human-computer interaction

The ability to recognize the affective states of a person we are communicating with is the core of emotional intelligence. Emotional intelligence is a facet of human intelligence that has been argued to be indispensable, and perhaps the most important, for successful interpersonal social interaction. This paper argues that next-generation human-computer interaction (HCI) designs need to include the essence of emotional intelligence (the ability to recognize a user's affective states) in order to become more human-like, more effective, and more efficient. Affective arousal modulates all nonverbal communicative cues (facial expressions, body movements, and vocal and physiological reactions). In face-to-face interaction, humans detect and interpret these interactive signals from their communication partner with little or no effort. Yet designing and developing an automated system that accomplishes these tasks is rather difficult. This paper surveys past work on solving these problems by computer and provides a set of recommendations for developing the first part of an intelligent multimodal HCI: an automatic personalized analyzer of a user's nonverbal affective feedback.
