The INTERSPEECH 2019 Computational Paralinguistics Challenge: Styrian Dialects, Continuous Sleepiness, Baby Sounds & Orca Activity

The INTERSPEECH 2019 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: in the Styrian Dialects Sub-Challenge, three types of Austrian-German dialects have to be classified; in the Continuous Sleepiness Sub-Challenge, the sleepiness of a speaker has to be assessed as a regression problem; in the Baby Sounds Sub-Challenge, five types of infant sounds have to be classified; and in the Orca Activity Sub-Challenge, orca sounds have to be detected. We describe the Sub-Challenges and the baseline feature extraction and classifiers, which include data-learnt (supervised) feature representations, the ‘usual’ ComParE and BoAW features, and deep unsupervised representation learning using the AUDEEP toolkit.
