AVEC 2019 Workshop and Challenge: State-of-Mind, Detecting Depression with AI, and Cross-Cultural Affect Recognition

The Audio/Visual Emotion Challenge and Workshop (AVEC 2019), 'State-of-Mind, Detecting Depression with AI, and Cross-Cultural Affect Recognition', is the ninth competition event aimed at comparing multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing and to bring together the health and emotion recognition communities, as well as the audiovisual processing communities, to compare the relative merits of various approaches to health and emotion recognition from real-life data. This paper presents the major novelties introduced this year, the challenge guidelines, the data used, and the performance of the baseline systems on the three proposed tasks: state-of-mind recognition, depression assessment with AI, and cross-cultural affect sensing.
