Detection of laughter in children's speech using spectral and prosodic acoustic features

Laughter is an important paralinguistic cue that can be useful in gauging the affective state of the speaker. In this paper, we present an approach to detecting laughter in children’s speech using acoustic features in the spectral and prosodic domains. Feature selection was performed using an information gain-based technique. In a speaker-independent validation using a support vector machine (SVM), an accuracy of 94.43% was observed, a 12.48% absolute improvement over the baseline result of 81.95%. To explore generalization properties, the speech and laughter models were tested on a completely different database of adult-child interactions, the Multimodal Dyadic Behavior Dataset (MDBD). The accuracy using the previously trained models was 70.58%. Even though the children in this database were toddlers (less than three years old), the results suggest that the predictive power of the selected features generalizes well to different forms of children’s laughter.
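As a rough illustration of two ingredients named in the abstract, information gain-based feature ranking and speaker-independent validation, the following Python sketch may help. It is not the authors' implementation: the equal-width binning of continuous features, the function names, and the toy data are all assumptions made for this example, and the SVM classifier itself is omitted.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, bins=4):
    """IG = H(labels) - H(labels | binned feature).

    Equal-width binning is an assumption of this sketch; other
    discretization schemes are equally plausible.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant features
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        b = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        groups[b].append(y)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def rank_features(X, y, bins=4):
    """Return (feature indices sorted by decreasing gain, list of gains)."""
    gains = [information_gain([row[j] for row in X], y, bins)
             for j in range(len(X[0]))]
    order = sorted(range(len(gains)), key=lambda j: gains[j], reverse=True)
    return order, gains

def leave_one_speaker_out(speakers):
    """Yield (held_out, train_idx, test_idx); no speaker appears in both sets."""
    for held_out in sorted(set(speakers)):
        train = [i for i, s in enumerate(speakers) if s != held_out]
        test = [i for i, s in enumerate(speakers) if s == held_out]
        yield held_out, train, test

# Toy, made-up data: feature 0 separates the classes, feature 1 does not.
X = [[0.1, 5.0], [0.2, 1.0], [0.9, 5.0], [0.8, 1.0], [0.15, 3.0], [0.85, 3.0]]
y = ["speech", "speech", "laughter", "laughter", "speech", "laughter"]
speakers = ["a", "a", "b", "b", "c", "c"]  # hypothetical speaker IDs

order, gains = rank_features(X, y)
folds = list(leave_one_speaker_out(speakers))
```

In a fuller pipeline, the top-ranked features from `rank_features` would be kept, and a classifier such as an SVM would be trained on each fold's training indices and scored on the held-out speaker, so that reported accuracy reflects speakers never seen during training.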
