The INTERSPEECH 2013 Computational Paralinguistics Challenge: Social Signals, Conflict, Emotion, Autism

The INTERSPEECH 2013 Computational Paralinguistics Challenge provides, for the first time, a unified test-bed for Social Signals such as laughter in speech. It further introduces conflict in group discussions as a new task and addresses autism and its manifestations in speech. Finally, emotion is revisited as a task, albeit with a broader set of twelve enacted emotional states. In this paper, we describe these four Sub-Challenges, their conditions, their baselines, and a new feature set extracted with the openSMILE toolkit that is provided to the participants.

Index Terms: Computational Paralinguistics, Challenge, Social Signals, Conflict, Emotion, Autism
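
As a rough illustration only (not part of the original paper), the minimal sketch below shows how a ComParE-style acoustic feature vector can be extracted with openSMILE's Python wrapper. The opensmile package, the file name example.wav, and the choice of the ComParE_2016 feature set (assumed here as a stand-in for the challenge's 2013 feature set, which it follows closely) are assumptions, not details taken from the paper.

    # Minimal sketch: extracting ComParE-style acoustic functionals with the
    # openSMILE Python wrapper. Package, feature-set choice, and file name
    # are illustrative assumptions, not specifics from the paper.
    import opensmile

    smile = opensmile.Smile(
        feature_set=opensmile.FeatureSet.ComParE_2016,      # assumed stand-in for the 2013 ComParE set
        feature_level=opensmile.FeatureLevel.Functionals,   # one feature vector (functionals) per file
    )

    # Returns a pandas DataFrame with one row and one column per acoustic feature.
    features = smile.process_file("example.wav")
    print(features.shape)

In a challenge-style setup, such per-file feature vectors would then be fed to a standard classifier to reproduce baseline results; the exact baseline configuration is described in the paper itself.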
