Discrimination of speaker sex and size when glottal-pulse rate and vocal-tract length are controlled.

A recent study [Smith and Patterson, J. Acoust. Soc. Am. 118, 3177-3186 (2005)] demonstrated that both the glottal-pulse rate (GPR) and the vocal-tract length (VTL) of vowel sounds have a large effect on the perceived sex and age (or size) of a speaker. The vowels for all of the "different" speakers in that study were synthesized from recordings of the sustained vowels of a single adult male speaker. This paper presents a follow-up study in which a range of vowels was synthesized from recordings of four different speakers (an adult man, an adult woman, a young boy, and a young girl) to determine whether the sex and age of the original speaker would affect listeners' judgments of whether a vowel was spoken by a man, woman, boy, or girl once the vowels had been equated for GPR and VTL. The sustained vowels of the four speakers were scaled to produce the same combinations of GPR and VTL, which covered the entire range normally encountered in everyday life. The results show that listeners readily distinguish children from adults on the basis of their sustained vowels, but that they struggle to distinguish the sex of the speaker.
