The role of glottal pulse rate and vocal tract length in the perception of speaker identity

In natural speech, a given speaker's vocal tract length (VTL) is effectively fixed, whereas glottal pulse rate (GPR) is varied to mark prosodic distinctions. This suggests that VTL will be a more reliable cue to speaker identity than GPR, and that listeners will accept larger changes in GPR before perceiving a change of speaker. We measured the effect of GPR and VTL differences on the perception of a speaker difference and found that listeners hear a different speaker given a VTL difference of 25%, but require a GPR difference of 45%.

Index Terms: speaker identity, glottal pulse rate, vocal tract length
