Automatic Phonetic Transcription of Laughter and Its Application to Laughter Synthesis

In this paper, automatic phonetic transcription of laughter is achieved with the help of Hidden Markov Models (HMMs). The models are evaluated in a speaker-independent way. Several measures to evaluate the quality of the transcriptions are discussed: some focus on the recognized sequences (without paying attention to the segmentation of the phones), while others take into account only the segmentation boundaries (without involving the phonetic labels). Although the results are far from perfect recognition, it is shown that using this kind of automatic transcription does not greatly impair the naturalness of laughter synthesis. The paper opens interesting perspectives in automatic laughter analysis as well as in laughter synthesis, as it will enable faster development of laughter synthesis on large sets of laughter data.
