A preliminary study of cross-lingual emotion recognition from speech: automatic classification versus human perception

This study investigates the effect of cross-lingual data on human perception and automatic classification of emotion from speech. We use four databases covering three languages (English, Chinese, and German) and two types of speech (acted and improvised). For automatic classification, performance degrades significantly in the cross-corpus setup compared with the within-corpus setup. For human perception, we observe differences between native and non-native speakers when judging emotions in a given language, and the performance loss in the cross-language setup is smaller than for automatic classification. In addition, we find that the automatic approaches work well at classifying the emotional activation category (positive versus negative activated emotions) but are poor at distinguishing emotions within the same activation category, a confusion pattern that differs from the one observed in the human perception experiment. This study provides insights toward a better understanding of cross-lingual human emotion perception and the development of robust automatic emotion recognition systems.
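The contrast between activation-level and emotion-level accuracy can be illustrated with a minimal sketch. It is not the study's evaluation code: the emotion labels, the `ACTIVATION` grouping, the helper names, and the `toy_pairs` predictions are all hypothetical and chosen only to show how collapsing emotions into activation categories can raise measured accuracy when most confusions stay within one category.

```python
from collections import Counter

# Assumed grouping, NOT taken from the study: which emotions count as
# high or low activation is a hypothetical choice for illustration.
ACTIVATION = {
    "angry": "high",
    "happy": "high",
    "sad": "low",
    "neutral": "low",
}


def accuracy(pairs):
    """Return the fraction of (true, predicted) pairs whose labels match."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no predictions given")
    return sum(true == pred for true, pred in pairs) / len(pairs)


def to_activation(pairs, mapping=ACTIVATION):
    """Map each (true, predicted) emotion pair to its activation categories."""
    return [(mapping[true], mapping[pred]) for true, pred in pairs]


def confusion(pairs):
    """Count how often each (true, predicted) combination occurs."""
    return Counter(pairs)


# Made-up predictions (not results from the paper): most errors swap
# emotions that share an activation category.
toy_pairs = [
    ("angry", "happy"),
    ("angry", "angry"),
    ("happy", "angry"),
    ("happy", "happy"),
    ("sad", "neutral"),
    ("sad", "sad"),
    ("neutral", "sad"),
    ("neutral", "happy"),
]

emotion_acc = accuracy(toy_pairs)                    # emotion-level accuracy
activation_acc = accuracy(to_activation(toy_pairs))  # activation-level accuracy

print(f"emotion-level accuracy:    {emotion_acc:.3f}")
print(f"activation-level accuracy: {activation_acc:.3f}")
print(confusion(to_activation(toy_pairs)))
```

In this toy data only one of the eight errors crosses the activation boundary, so the activation-level score is much higher than the emotion-level score. The same two functions could be applied to real within-corpus and cross-corpus predictions to compare confusion patterns.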
