Modeling the voice source in terms of spectral slopes.

A psychoacoustic model of the voice source spectrum is proposed. The model is characterized by four spectral slope parameters: the difference in amplitude between the first two harmonics (H1-H2), the second and fourth harmonics (H2-H4), the fourth harmonic and the harmonic nearest 2 kHz in frequency (H4-2 kHz), and the harmonic nearest 2 kHz and that nearest 5 kHz (2 kHz-5 kHz). As a step toward model validation, experiments were conducted to establish the acoustic and perceptual independence of these parameters. In experiment 1, the model was fit to a large number of voice sources. Results showed that parameters are predictable from one another, but that these relationships are due to overall spectral roll-off. Two additional experiments addressed the perceptual independence of the source parameters. Listener sensitivity to H1-H2, H2-H4, and H4-2 kHz did not change as a function of the slope of an adjacent component, suggesting that sensitivity to these components is robust. Listener sensitivity to changes in spectral slope from 2 kHz to 5 kHz depended on complex interactions between spectral slope, spectral noise levels, and H4-2 kHz. It is concluded that the four parameters represent non-redundant acoustic and perceptual aspects of voice quality.
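The four parameters defined above are differences between harmonic levels, so the arithmetic can be sketched in a few lines. This is a minimal illustration only, not the authors' measurement procedure: the function names `nearest_harmonic` and `slope_parameters`, the list-of-dB-levels input format, and the synthetic 12 dB/octave source are all assumptions made for this example.

```python
import math


def nearest_harmonic(f0, target_hz):
    """Return the harmonic number (>= 1) whose frequency k * f0 is closest to target_hz."""
    return max(1, round(target_hz / f0))


def slope_parameters(f0, amps_db):
    """Compute the four slope parameters from harmonic levels.

    Assumed input format (hypothetical): amps_db[k - 1] holds the level of
    harmonic k in dB, and f0 is the fundamental frequency in Hz.
    """
    k2k = nearest_harmonic(f0, 2000.0)  # harmonic nearest 2 kHz
    k5k = nearest_harmonic(f0, 5000.0)  # harmonic nearest 5 kHz
    if len(amps_db) < max(4, k5k):
        raise ValueError("not enough harmonics to reach 5 kHz")

    def level(k):
        return amps_db[k - 1]

    return {
        "H1-H2": level(1) - level(2),
        "H2-H4": level(2) - level(4),
        "H4-2kHz": level(4) - level(k2k),
        "2kHz-5kHz": level(k2k) - level(k5k),
    }


# Synthetic source (an assumption for illustration): levels fall 12 dB per octave.
f0 = 200.0
amps = [-12.0 * math.log2(k) for k in range(1, 26)]
params = slope_parameters(f0, amps)
for name, value in params.items():
    print(f"{name}: {value:.2f} dB")
```

With a uniform roll-off like this one, every parameter is a fixed multiple of the same slope, which is the kind of mutual predictability that experiment 1 attributes to overall spectral roll-off.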