A simple approach to non-uniform vowel normalization

In this paper, we present results on non-uniform vowel normalization and show that the frequency warping needed to perform non-uniform vowel normalization is similar to the mel scale. We compare our method with Fant's non-uniform vowel normalization method and show that the proposed frequency-warping approach achieves similar performance without any knowledge of the spoken vowel or the formant number. The proposed approach is motivated by a desire to perform non-uniform speaker normalization in automatic speech recognition systems. We also present results of a more comprehensive study of our earlier work on non-uniform scaling, which again shows that the mel scale is the appropriate warping function. All results in this paper are based on data from the Peterson & Barney and Hillenbrand et al. vowel databases.
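As a rough illustration of why a mel-like warping corresponds to non-uniform scaling, the sketch below applies a constant shift on the mel axis and reports the scale factor it implies in hertz. It assumes the widely used analytic approximation mel = 2595 log10(1 + f/700), which is not necessarily the warping function estimated in this paper. The function names and the 50-mel shift are hypothetical and chosen only for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Common analytic mel approximation (an assumption, not the paper's fitted warp)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def implied_scale_factor(f_hz, mel_shift):
    """Ratio f'/f when a formant at f_hz is translated by mel_shift on the mel axis."""
    return mel_to_hz(hz_to_mel(f_hz) + mel_shift) / f_hz


if __name__ == "__main__":
    # Hypothetical formant frequencies in Hz and a hypothetical speaker shift in mels.
    for f in (500.0, 1500.0, 2500.0):
        print(f, round(implied_scale_factor(f, 50.0), 4))
```

Under this approximation, the same translation in the warped domain yields a different scale factor at each frequency. This is the sense in which the normalization is non-uniform, and no vowel identity or formant number is needed to apply it.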

[1] J. Hillenbrand, et al. Acoustic characteristics of American English vowels, 1994, The Journal of the Acoustical Society of America.

[2] G. Fant. Non-uniform vowel normalization, 1975.

[3] Leon Cohen, et al. Frequency-warping and speaker-normalization, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[4] S. S. Stevens, et al. The Relation of Pitch to Frequency: A Revised Scale, 1940.

[5] G. E. Peterson, et al. Control Methods Used in a Study of the Vowels, 1951.

[6] Srinivasan Umesh, et al. Non-uniform scaling based speaker normalization, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[7] Leon Cohen, et al. Frequency-warping in speech, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[8] S. Howard Bartley, et al. The relation of pitch to frequency, 1950.