Real-time automatic emotion recognition from speech

In recent years, the need to respond to the user's emotional state has become widely acknowledged in human-machine communication. Speech in particular has moved into focus as a means of recognizing this state automatically. So far, however, work in this area has been mostly academic and of limited practical relevance, relying on pre-recorded databases of emotional speech. The requirements there differ from those of online analysis; in the latter case, conditions are considerably harder and less predictable. This dissertation addresses the real-time automatic recognition of emotions from speech based on acoustic features. To this end, experiments were first conducted on existing emotional speech databases to identify suitable methods for segmentation, feature extraction, and classification of the speech signal, where suitable means that the methods work as fast and as accurately as possible. To obtain results that generalize as widely as possible, the experiments were carried out on three databases covering very different types of speech and emotion: the Berlin Database of Emotional Speech, the FAU Aibo Emotion Corpus, and the SmartKom Mobile Corpus, which together contain both read and spontaneous speech as well as acted and natural emotions. The insights gained from these experiments were used to implement EmoVoice, a comprehensive collection of tools and programs for online and offline emotion recognition. The practical usability of EmoVoice, in particular also by external software developers, was demonstrated through several prototypical applications and three user studies.
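The pipeline outlined above (segmentation of the speech signal, acoustic feature extraction, classification) can be sketched in a minimal form. This is an illustrative toy, not EmoVoice's actual implementation: it computes two frame-level contours (log energy and zero-crossing rate), summarizes each segment with turn-level statistics, and classifies with a nearest-centroid rule, whereas real systems use far richer feature sets (pitch, MFCCs, voice quality) and stronger classifiers such as SVMs or HMMs.

```python
import numpy as np

FRAME_LEN, HOP = 400, 160  # 25 ms frames with 10 ms hop at 16 kHz

def frame_signal(signal):
    """Split a 1-D signal into overlapping analysis frames."""
    n = 1 + max(0, (len(signal) - FRAME_LEN) // HOP)
    return np.stack([signal[i * HOP:i * HOP + FRAME_LEN] for i in range(n)])

def turn_features(signal):
    """Turn-level functionals (mean/std) over frame-level contours:
    log energy and zero-crossing rate."""
    frames = frame_signal(signal)
    log_energy = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return np.array([log_energy.mean(), log_energy.std(),
                     zcr.mean(), zcr.std()])

class NearestCentroid:
    """Toy stand-in for the classification stage."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0)
                                    for c in self.classes_])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.classes_[np.argmin(d, axis=1)]
```

Speed matters here as much as accuracy: each stage operates on short frames so that a decision is available shortly after a speech segment ends.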
Furthermore, four offline studies on multimodal emotion recognition were conducted, combining acoustic features with context information (gender), biosignals, word information, and facial expressions, since multimodal recognition approaches promise higher recognition accuracy.
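Decision-level (late) fusion is one common way to combine such modalities: each modality's classifier outputs a class-posterior vector, and the vectors are merged by a weighted average before the final decision. The sketch below is a generic illustration of this technique, not the specific fusion schemes of the four studies; the weights are an assumed way to encode differing trust in the modalities.

```python
import numpy as np

def late_fusion(posteriors, weights=None):
    """Fuse per-modality class-probability vectors by weighted averaging.

    posteriors: one 1-D array per modality (e.g. speech, facial
    expression, biosignals), each summing to 1 over the emotion classes.
    Returns the fused class index and the fused distribution.
    """
    P = np.stack([np.asarray(p, dtype=float) for p in posteriors])
    if weights is None:
        weights = np.ones(len(P)) / len(P)  # equal trust in every modality
    fused = np.average(P, axis=0, weights=weights)
    return int(np.argmax(fused)), fused
```

Early fusion, by contrast, concatenates the raw feature vectors of all modalities and trains a single classifier on the combined vector.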
