Effect of vocal effort on spectral properties of vowels.

The effects of variations in vocal effort corresponding to common conversation situations on spectral properties of vowels were investigated. A database in which three degrees of vocal effort were suggested to the speakers by varying the distance to their interlocutor in three steps (close--0.4 m, normal--1.5 m, and far--6 m) was recorded. The speech materials consisted of isolated French vowels, uttered by ten naive speakers in a quiet furnished room. Manual measurements of fundamental frequency F0, frequencies, and amplitudes of the first three formants (F1, F2, F3, A1, A2, and A3), and on total amplitude were carried out. The speech materials were perceptually validated in three respects: identity of the vowel, gender of the speaker, and vocal effort. Results indicated that the speech materials were appropriate for the study. Acoustic analysis showed that F0 and F1 were highly correlated with vocal effort and varied at rates close to 5 Hz/dB for F0 and 3.5 Hz/dB for F1. Statistically F2 and F3 did not vary significantly with vocal effort. Formant amplitudes A1, A2, and A3 increased significantly; The amplitudes in the high-frequency range increased more than those in the lower part of the spectrum, revealing a change in spectral tilt. On the average, when the overall amplitude is increased by 10 dB, A1, A2, and A3 are increased by 11, 12.4, and 13 dB, respectively. Using "auditory" dimensions, such as the F1-F0 difference, and a "spectral center of gravity" between adjacent formants for representing vowel features did not reveal a better constancy of these parameters with respect to the variations of vocal effort and speaker. Thus a global view is evoked, in which all of the aspects of the signal should be processed simultaneously.

[1]  J C Junqua,et al.  The Lombard reflex and its role on human listeners and automatic speech recognizers. , 1993, The Journal of the Acoustical Society of America.

[2]  H. S. Gopal,et al.  A perceptual model of vowel recognition based on the auditory representation of American English vowels. , 1986, The Journal of the Acoustical Society of America.

[3]  Agaath M. C. Sluijter,et al.  Spectral balance as an acoustic correlate of linguistic stress. , 1996, The Journal of the Acoustical Society of America.

[4]  Björn Granström,et al.  Neglected dimensions in speech synthesis , 1992, Speech Commun..

[5]  Maria-Gabriella Di Benedetto Acoustic and perceptual evidence of a complex relation between F1 and F0 in determining vowel height , 1994 .

[6]  V. V. van Heuven,et al.  Spectral balance as a cue in the perception of linguistic stress. , 1997, The Journal of the Acoustical Society of America.

[7]  R. Schulman,et al.  Articulatory dynamics of loud and normal speech. , 1989, The Journal of the Acoustical Society of America.

[8]  Ann K. Syrdal,et al.  Aspects of a model of the auditory representation of american english vowels , 1985, Speech Commun..

[9]  J. T. Hogan,et al.  Vowel identification: orthographic, perceptual, and acoustic aspects. , 1982, The Journal of the Acoustical Society of America.

[10]  H. Traunmüller Perceptual dimension of openness in vowels. , 1981, The Journal of the Acoustical Society of America.