Cross-Lingual Transfer Learning for Affective Spoken Dialogue Systems

This paper presents a case study of cross-lingual transfer learning applied to affective computing in the domain of spoken dialogue systems. Prosodic features of correction dialog acts are modeled on a group of languages and compared with languages excluded from that analysis. Speech in several languages was recorded in carefully staged Wizard-of-Oz experiments; however, a balanced distribution of speakers per language could not be ensured. To assess the feasibility of cross-lingual transfer learning and to ensure reliable classification of corrections independently of language, we employed different machine learning approaches along with relevant acoustic-prosodic feature sets. The results of the mono-lingual experiments (trained and tested on a single language) and the cross-lingual experiments (trained on several languages and tested on the remaining ones) were analyzed and compared in terms of accuracy and F1 score.
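To make the evaluation protocol concrete, below is a minimal sketch of a leave-one-language-out setup of the kind described above, assuming (for illustration only) utterance-level acoustic-prosodic functionals such as eGeMAPS-style features and off-the-shelf scikit-learn classifiers (Random Forest and an RBF-kernel SVM). The data arrays, language tags, and the `leave_one_language_out` helper are hypothetical placeholders, not the authors' actual pipeline.

```python
# Hypothetical sketch of the cross-lingual protocol: train on all languages
# except one, test on the held-out language, and report accuracy and F1 for a
# binary correction vs. non-correction classifier. Feature extraction (e.g.
# eGeMAPS-style functionals) is assumed to have been done beforehand.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, f1_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline


def leave_one_language_out(X, y, languages, make_clf):
    """Train on all languages except the held-out one; test on the held-out one."""
    results = {}
    for held_out in np.unique(languages):
        train_mask = languages != held_out
        test_mask = ~train_mask
        clf = make_pipeline(StandardScaler(), make_clf())
        clf.fit(X[train_mask], y[train_mask])
        y_pred = clf.predict(X[test_mask])
        results[held_out] = {
            "accuracy": accuracy_score(y[test_mask], y_pred),
            "f1": f1_score(y[test_mask], y_pred),
        }
    return results


if __name__ == "__main__":
    # Placeholder data: 88-dimensional feature vectors per utterance, binary
    # labels (1 = correction, 0 = other dialog act), and per-utterance language tags.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 88))
    y = rng.integers(0, 2, size=400)
    languages = np.array(["de", "en", "ru", "ko"] * 100)

    classifiers = [
        ("RandomForest", lambda: RandomForestClassifier(n_estimators=200)),
        ("SVM-RBF", lambda: SVC(kernel="rbf", C=1.0)),
    ]
    for name, make_clf in classifiers:
        scores = leave_one_language_out(X, y, languages, make_clf)
        for lang, s in scores.items():
            print(f"{name} | held-out={lang} | acc={s['accuracy']:.2f} | f1={s['f1']:.2f}")
```

With real data, the per-language partitions would come from the Wizard-of-Oz recordings, and the same loop could serve the mono-lingual baseline by restricting both training and test masks to a single language.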
