This study examined Auditory (A) and Visual (V) speech (speech-related head and face movement) as a function of noise environment. Measures of AV speech were recorded for three male talkers and one female talker producing 10 sentences in quiet and in four styles of background noise (Lombard speech). Auditory speech was analyzed in terms of overall intensity, duration, spectral tilt, and prosodic parameters derived from Fujisaki-model-based parameterizations of F0 contours. Visual speech was analyzed in terms of Principal Components (PCs) of head and face movement. Compared to speech in quiet, Lombard speech was louder, longer in duration, had more energy at higher frequencies (particularly with babble speech), and had accent and phrase commands of greater mean amplitude. Visual Lombard speech showed greater influence of the PCs associated with jaw and mouth movement, face expansion and contraction, and head rotation (pitch). Lombard speech showed increased AV speech correlations between RMS speech intensity and the PCs that involved jaw and mouth movement. A similar increase in correlation occurred for intensity and head rotation (pitch). For Lombard speech, all talkers showed an increased correlation between F0 and head translation (raising and lowering). Increased F0 correlations for other head movements were more idiosyncratic. These findings suggest that the relationships underlying Audio-Visual speech perception differ depending on how that speech was produced.