Gender recognition from vocal source

This paper studies how effectively male and female voices can be recognized automatically from the voice source, using the solution of the inverse problem for the glottis area dynamics and for the waveform of the glottal airflow volume velocity pulse. The inverse problem is regularized by using analytical models of the voice excitation pulse and of the glottis area dynamics, together with a model of one-dimensional glottal airflow. The parameters of these models and the spectral parameters of the volume velocity pulse are considered as candidate features. The most promising are found to be the instant of maximum glottis area, the maximum derivative of the area, the spectral slope of the glottal airflow volume velocity pulse, the amplitude ratios of the harmonics of this spectrum, and the pitch. On the plane of the first two principal components of this parameter space, the classification error is almost half of that obtained with the pitch alone. The recognition probability is found to be 94.7% for male voices and 95.9% for female voices.
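The classification step described above (projecting a multi-parameter feature space onto its first two principal components and separating the two classes on that plane) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the five feature columns are synthetic stand-ins for the voice-source parameters, the class means are arbitrary, and the nearest-centroid rule is a placeholder, since the abstract does not state which classifier was used. The resulting error rate says nothing about the reported results.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Standardize the columns of X and project onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    return Z @ components.T, components

def nearest_centroid_predict(scores, labels):
    """Assign each point to the class whose centroid on the PCA plane is closest.

    Placeholder classifier (assumption); evaluated on the training data itself.
    """
    classes = np.unique(labels)
    centroids = np.stack([scores[labels == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(scores[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Synthetic data (assumption): 5 hypothetical features per voice sample,
# 200 samples per class, with arbitrary class means and unit-variance noise.
rng = np.random.default_rng(0)
n_per_class = 200
labels = np.repeat([0, 1], n_per_class)
class_means = np.array([[0.0, 0.0, 0.0, 0.0, 0.0],
                        [1.5, 1.0, -1.0, 0.5, 2.0]])
X = class_means[labels] + rng.normal(size=(2 * n_per_class, 5))

scores, components = pca_project(X)
pred = nearest_centroid_predict(scores, labels)
error_rate = float(np.mean(pred != labels))
print(f"error rate on the synthetic data: {error_rate:.3f}")
```

To compare against a single-feature baseline in the same spirit as the pitch-only comparison, one could pass a single column of `X` (reshaped to two dimensions) to `nearest_centroid_predict` and compare the two error rates.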
