Real-time audiovisual laughter detection

Laughter detection is an essential component of effective human-computer interaction. This work addresses the problem of laughter detection in a real-time environment. We use annotated audio and visual data collected from a Kinect sensor to identify discriminative features for audio and video separately, and show how these features can be used with classifiers such as support vector machines (SVMs). The two modalities are then fused into a single output to form a decision. We test our setup by emulating real-time data with a Kinect sensor and compare the results against the offline version of the setup. Our results indicate that the system delivers promising performance for real-time human-computer interaction.
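The per-modality classification and decision-level fusion described above can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the feature dimensions, synthetic data, and equal fusion weight are all assumptions, and scikit-learn's `SVC` stands in for whatever SVM implementation was used.

```python
# Hypothetical sketch: one SVM per modality, fused at the decision level.
# Feature dimensions, data, and the fusion weight are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic training data: 100 samples, binary labels (1 = laughter).
y_train = rng.integers(0, 2, size=100)
X_audio = rng.normal(size=(100, 13)) + y_train[:, None]  # audio features (e.g. MFCC-like)
X_video = rng.normal(size=(100, 20)) + y_train[:, None]  # visual features (e.g. face points)

# One SVM per modality, with probability estimates enabled for fusion.
svm_audio = SVC(probability=True).fit(X_audio, y_train)
svm_video = SVC(probability=True).fit(X_video, y_train)

def fuse(xa, xv, w_audio=0.5):
    """Weighted sum of per-modality laughter posteriors -> binary decision."""
    p_a = svm_audio.predict_proba(xa.reshape(1, -1))[0, 1]
    p_v = svm_video.predict_proba(xv.reshape(1, -1))[0, 1]
    return int(w_audio * p_a + (1 - w_audio) * p_v >= 0.5)

# Fuse the two modality scores for one new (synthetic) observation.
decision = fuse(rng.normal(size=13) + 1.0, rng.normal(size=20) + 1.0)
print(decision)
```

In a real-time setting the same `fuse` step would run on each incoming frame's audio and video feature vectors; the fusion weight could be tuned on held-out data rather than fixed at 0.5.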
