Recognizing prosody from the lips: Is it possible to extract prosodic focus from lip features?

The aim of this chapter is to examine whether prosodic information can be extracted from lip features. We used two measurement techniques that enable automatic lip feature extraction to evaluate the “lip pattern” of prosodic focus in French. Two corpora of Subject-Verb-Object (SVO) sentences were designed. Four focus conditions (S, V, O, or neutral) were elicited in a natural dialogue situation. In a first set of experiments, we recorded two speakers of French with front and profile video cameras. The speakers wore blue make-up and facial markers. In a second set of experiments, we recorded five speakers with a 3D optical tracker. An analysis of the lip features showed that visible articulatory lip correlates of focus exist for all speakers. Two types of patterns were observed: absolute and differential. A potential outcome of this study is a set of criteria for the automatic visual detection of prosodic focus from lip data.
