A comparison of vowel normalization procedures for language variation research.

This study evaluates vowel normalization procedures for use in language variation research. The procedures were compared on how effectively they (a) preserve phonemic information, (b) preserve information about the talker's regional background (or sociolinguistic information), and (c) minimize anatomical/physiological variation in acoustic representations of vowels. Recordings were made of 80 female and 80 male talkers of Dutch, stratified by gender and regional background. The normalization procedures were applied to measurements of the fundamental frequency and the first three formant frequencies for a large set of vowel tokens, and were evaluated through statistical pattern analysis. The results show that procedures that use information across multiple vowels ("vowel-extrinsic" information) to normalize a single vowel token performed better than those that use only information contained in the vowel token itself ("vowel-intrinsic" information). Furthermore, procedures that operate on individual formants performed better than those that use information across multiple formants (e.g., "formant-extrinsic" F2-F1).
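To make the vowel-extrinsic versus vowel-intrinsic distinction concrete, the following is a minimal sketch rather than the procedure set evaluated in the study, which the abstract does not enumerate. It assumes a per-talker z-score of each formant (the approach commonly attributed to Lobanov) as the vowel-extrinsic, formant-intrinsic example, and the F2-F1 difference mentioned in the abstract as the vowel-intrinsic, formant-extrinsic example. The function names, the formant values, and the uniform 1.2 scaling used as a crude stand-in for an anatomical difference between talkers are all hypothetical.

```python
from statistics import mean, stdev


def zscore_normalize(tokens):
    """Vowel-extrinsic, formant-intrinsic sketch: z-score each formant
    using the mean and spread of ALL vowel tokens from one talker.

    tokens: list of (vowel_label, f1_hz, f2_hz) for a single talker.
    Assumption: sample standard deviation is used; some descriptions
    use the population standard deviation instead.
    """
    f1_values = [f1 for _, f1, _ in tokens]
    f2_values = [f2 for _, _, f2 in tokens]
    m1, s1 = mean(f1_values), stdev(f1_values)
    m2, s2 = mean(f2_values), stdev(f2_values)
    return [(v, (f1 - m1) / s1, (f2 - m2) / s2) for v, f1, f2 in tokens]


def f2_minus_f1(tokens):
    """Vowel-intrinsic, formant-extrinsic sketch: each token is transformed
    using only its own formants, combined across F1 and F2."""
    return [(v, f2 - f1) for v, f1, f2 in tokens]


# Hypothetical formant values in Hz, for illustration only.
talker_a = [("i", 300.0, 2300.0), ("a", 800.0, 1300.0), ("u", 350.0, 800.0)]
# A second talker modeled as a uniformly scaled copy of the first.
talker_b = [(v, 1.2 * f1, 1.2 * f2) for v, f1, f2 in talker_a]

for label, talker in (("A", talker_a), ("B", talker_b)):
    print(label, "z-score:", zscore_normalize(talker))
    print(label, "F2-F1:  ", f2_minus_f1(talker))
```

Under this simplified scaling assumption, the z-scored values of the two talkers coincide (up to floating-point rounding) because the talker's own mean and spread absorb the scale factor, whereas the F2-F1 values still differ between talkers. Real between-talker variation is not a pure uniform scaling, so this illustrates the mechanism only and does not reproduce the study's results.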
