Investigating syllabic prominence with Conditional Random Fields and Latent-Dynamic Conditional Random Fields

This study investigates several issues in the automatic detection of prominence, with the aim of better understanding the phenomenon so that existing prominence detection systems can be improved. The study is threefold. First, the presence of hidden dynamics in the sequence of prominent and non-prominent syllables is tested by comparing the results obtained with Conditional Random Fields (CRFs) and Latent-Dynamic Conditional Random Fields (LDCRFs). Second, the size of the context to be taken into account when determining prominence is examined. Third, a new set of features is investigated. The results show that LDCRFs systematically outperform CRFs, that a context of three syllables is generally sufficient for prominence detection, and that syllable length is a useful feature to include. In addition, the new pitch-movement features we introduce can adequately replace the heuristic measures used in previous work.
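As an illustration of what a "context of three syllables" could mean in practice, the following minimal Python sketch attaches the features of neighbouring syllables to each syllable before sequence labelling. It is not the authors' implementation. The function name `add_context`, the feature name `duration`, the padding marker, and the reading of "three syllables" as a window centred on the current syllable (one to the left, one to the right) are all assumptions made for this example.

```python
from typing import Dict, List


def add_context(syllables: List[Dict[str, float]], width: int = 3) -> List[Dict[str, float]]:
    """Return one feature dict per syllable, extended with its neighbours' features.

    `width` is the total window size (assumed odd and centred on the
    current syllable); this interpretation is an assumption, not taken
    from the paper.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd number")
    half = width // 2
    windowed = []
    for i in range(len(syllables)):
        feats: Dict[str, float] = {}
        for offset in range(-half, half + 1):
            j = i + offset
            if 0 <= j < len(syllables):
                # Prefix each feature with its position relative to the current syllable.
                for name, value in syllables[j].items():
                    feats[f"{offset:+d}:{name}"] = value
            else:
                # Hypothetical padding marker for positions outside the utterance.
                feats[f"{offset:+d}:pad"] = 1.0
        windowed.append(feats)
    return windowed


# Toy utterance: syllable durations in seconds (made-up values).
utterance = [{"duration": 0.12}, {"duration": 0.31}, {"duration": 0.09}]
windowed = add_context(utterance, width=3)
```

Each resulting dictionary describes one syllable together with its immediate neighbours, which is the kind of per-position observation a linear-chain sequence labeller typically consumes. Varying `width` would be one way to reproduce a context-size comparison of the sort described above.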
