Fine-grain voice strength estimation from vowel spectral cues

This study investigates the possibility of recovering voice strength, i.e. the sound level produced by the speaker, from the recorded signal. The dataset consists of 720 isolated vowel tokens recorded while two interlocutors interacted orally in a furnished room, at distances ranging from 0.40 to 6 meters. For each token, voice strength is measured at the intensity peak, and several sets of acoustic cues are extracted from the signal spectrum after frequency weighting and intensity normalization. In the first phase, the tokens are grouped into categories of increasing voice strength. Discriminant Analysis then produces a classifier that takes into account all the signal dimensions implicitly coded in the set of cues. In the second phase, the cues of a new token are given to the classifier, which returns the token's distances to the groups; these distances provide the basis for estimating the unknown voice strength. The quality of the process is evaluated either in self-consistency mode or by cross-validation, i.e. by comparing the estimate with the value initially measured on the same token. The statistical margin of error is low, on the order of 3 dB, depending on the set of cues used.
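The second phase can be illustrated with a minimal sketch. The abstract does not specify how the distances to the groups are converted into a voice strength value, so the weighting rule below is an assumption: each group's representative level is weighted by exp(-d²/2), as would follow from Gaussian class likelihoods with equal priors. The function name `estimate_strength` and its parameters are hypothetical and are not taken from the study.

```python
import math


def estimate_strength(distances, group_levels_db):
    """Combine classifier distances into one voice strength estimate (dB).

    distances       -- distance of the new token to each strength group
    group_levels_db -- representative level (dB) of each group

    Assumption: weights decay as exp(-d^2 / 2); the study may combine
    the distances differently.
    """
    if len(distances) != len(group_levels_db) or not distances:
        raise ValueError("need one distance per group, and at least one group")

    squared = [d * d for d in distances]
    # Subtract the smallest squared distance so the largest weight is 1.0,
    # which avoids all weights underflowing to zero for distant tokens.
    smallest = min(squared)
    weights = [math.exp(-(s - smallest) / 2.0) for s in squared]

    total = sum(weights)
    return sum(w * lvl for w, lvl in zip(weights, group_levels_db)) / total
```

With this rule, a token equidistant from all groups receives the mean of the group levels, while a token lying very close to one group receives essentially that group's level. Intermediate distances yield estimates between group levels, which is what would allow a resolution finer than the category width.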
