Integrating sequence information in the audio-visual detection of word prominence in a human-machine interaction scenario
[1] Jon Barker, et al. An audio-visual corpus for speech perception and automatic speech recognition, 2006, The Journal of the Acoustical Society of America.
[2] Elizabeth Shriberg, et al. Spontaneous speech: how people really talk and why engineers should care, 2005, INTERSPEECH.
[3] Hynek Hermansky, et al. RASTA processing of speech, 1994, IEEE Trans. Speech Audio Process.
[4] Julia Hirschberg, et al. Detecting Pitch Accents at the Word, Syllable and Vowel Level, 2009, NAACL.
[5] Andreas Stolcke, et al. Dialogue act modeling for automatic tagging and recognition of conversational speech, 2000, CL.
[6] Martin Heckmann, et al. Inter-speaker variability in audio-visual classification of word prominence, 2013, INTERSPEECH.
[7] Julia Hirschberg, et al. Characterizing and Predicting Corrections in Spoken Dialogue Systems, 2006, CL.
[8] Gina-Anne Levow, et al. Automatic Prosodic Labeling with Conditional Random Fields and Rich Acoustic Features, 2008, IJCNLP.
[9] Yasemin Altun, et al. Using Conditional Random Fields to Predict Pitch Accents in Conversational Speech, 2004, ACL.
[10] Chih-Jen Lin, et al. LIBSVM: A library for support vector machines, 2011, TIST.
[11] Mari Ostendorf, et al. Error-correction detection and response generation in a spoken dialogue system, 2005, Speech Commun.
[12] Mattias Heldner, et al. On the reliability of overall intensity and spectral emphasis as acoustic correlates of focal accents in Swedish, 2003, J. Phonetics.
[13] Gina-Anne Levow, et al. Context in multi-lingual tone and pitch accent recognition, 2005, INTERSPEECH.
[14] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.
[15] Marion Dohen, et al. Pre-focal rephrasing, focal enhancement and postfocal deaccentuation in French, 2004, INTERSPEECH.
[16] Gina-Anne Levow, et al. Identifying local corrections in human-computer dialogue, 2004, INTERSPEECH.
[17] Julia Hirschberg, et al. Corrections in spoken dialogue systems, 2000, INTERSPEECH.
[18] Eric Fosler-Lussier, et al. Conditional Random Fields in Speech, Audio, and Language Processing, 2013, Proceedings of the IEEE.
[19] Julia Hirschberg, et al. Prosodic and other cues to speech recognition failures, 2004, Speech Commun.
[20] Yi Xu, et al. Phonetic realization of focus in English declarative intonation, 2005, J. Phonetics.
[21] Martin Heckmann, et al. Audio-visual Evaluation and Detection of Word Prominence in a Human-Machine Interaction Scenario, 2012, INTERSPEECH.
[22] Andrew Rosenberg, et al. Automatic detection and classification of prosodic events, 2009.