Sources of listener disagreement in voice quality assessment.

Traditional interval or ordinal rating scale protocols appear to be poorly suited to measuring vocal quality. To investigate why this might be so, listeners were asked to classify pathological voices as having or not having different voice qualities. It was reasoned that this simple task would allow listeners to focus on the kind of quality a voice had, rather than how much of a quality it possessed, and thus might provide evidence for the validity of traditional vocal qualities. In experiment 1, listeners judged whether natural pathological voice samples were or were not primarily breathy and, in a separate task, whether they were or were not primarily rough. Listener agreement in both tasks was above chance, but listeners agreed poorly that individual voices belonged in particular perceptual classes. To determine whether these results reflect listeners' difficulty agreeing about single perceptual attributes of complex stimuli, listeners in experiment 2 classified natural pathological voices and synthetic stimuli (varying in f0 only) as low pitched or not low pitched. If disagreements derive from difficulties dividing an auditory continuum consistently, then patterns of agreement should be similar for both kinds of stimuli. In fact, listener agreement was significantly better for the synthetic stimuli than for the natural voices. Difficulty isolating single perceptual dimensions of complex stimuli thus appears to be one reason why traditional unidimensional rating protocols are unsuited to measuring pathological voice quality. Listeners did agree that a few aphonic voices were breathy, and that a few voices with prominent vocal fry and/or interharmonics were rough. These few cases of agreement may have occurred because the acoustic characteristics of the voices in question corresponded to the limiting case of the quality being judged. Values of f0 that generated listener agreement in experiment 2 were more extreme for natural than for synthetic stimuli, consistent with this interpretation.
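The abstract does not say which statistic was used to compare listener agreement with chance. As a minimal sketch, assuming a chance-corrected measure such as Fleiss' kappa is acceptable for a binary classification task, the following shows how overall agreement and per-voice agreement can diverge. The function names and the vote counts are hypothetical and are not taken from the study.

```python
def per_voice_agreement(counts):
    """Proportion of agreeing listener pairs for each voice.

    counts[i][j] is the number of listeners who placed voice i in
    category j (e.g. j=0 "breathy", j=1 "not breathy"). Every voice
    must be judged by the same number of listeners (at least 2).
    """
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each voice needs the same number (>= 2) of judgments")
    return [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]


def fleiss_kappa(counts):
    """Fleiss' kappa: observed agreement corrected for chance agreement."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    p_bar = sum(per_voice_agreement(counts)) / n_items

    # Chance agreement from the overall share of votes in each category.
    n_cats = len(counts[0])
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_chance = sum(s * s for s in shares)
    if p_chance == 1.0:
        # All votes fell in one category; kappa is undefined.
        return float("nan")
    return (p_bar - p_chance) / (1.0 - p_chance)


# Hypothetical data: 10 listeners, 4 voices, columns = (breathy, not breathy).
votes = [[10, 0], [9, 1], [5, 5], [1, 9]]
print(fleiss_kappa(votes))         # overall chance-corrected agreement
print(per_voice_agreement(votes))  # agreement voice by voice
```

In this illustration the overall kappa can be above zero even though a voice with an even split of votes shows low pairwise agreement, which mirrors the distinction the abstract draws between above-chance agreement overall and poor agreement about individual voices.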

[1]  Daniel Jones An outline of English phonetics , 1956 .

[2]  Domis E. Pluggé “Voice qualities” in oral interpretation , 1942 .

[3]  Giles Wilkeson Gray The “voice qualities”; in the history of elocution , 1943 .

[4]  G. E. Peterson,et al.  Control Methods Used in a Study of the Vowels , 1951 .

[5]  G. Fairbanks Voice and articulation drillbook , 1960 .

[6]  H. Hollien,et al.  On the nature of vocal fry. , 1966, Journal of speech and hearing research.

[7]  R. W. Wendahl,et al.  Some parameters of auditory roughness. , 1966, Folia phoniatrica.

[8]  H. Hollien,et al.  Perceptual differentiation of vocal fry and harshness. , 1968, Journal of speech and hearing research.

[9]  H Hollien,et al.  Perceptual study of vocal fry. , 1968, The Journal of the Acoustical Society of America.

[10]  - Jones, D., An Outline of English Phonetics, 9 th ed, Cambridge, , 1970 .

[11]  H Hollien,et al.  Vocal fold vibratory patterns of pulse register phonation. , 1977, Folia phoniatrica.

[12]  R A Harshman,et al.  Crosslanguage Differences in Tone Perception: a Multidimensional Scaling Investigation , 1978, Language and speech.

[13]  Raymond H. Colton,et al.  Elements of Voice Quality: Perceptual, Acoustic, and Physiologic Aspects , 1981 .

[14]  P. Milenkovic,et al.  Least mean square measures of voice perturbation. , 1987, Journal of speech and hearing research.

[15]  Jack Gandour,et al.  Perception and production of tone in aphasia , 1988, Brain and Language.

[16]  D. Klatt,et al.  Analysis, synthesis, and perception of voice quality variations among female and male talkers. , 1990, The Journal of the Acoustical Society of America.

[17]  J. Kreiman,et al.  Listener experience and perception of voice quality. , 1988, Journal of speech and hearing research.

[18]  Donald G. Childers,et al.  Modeling vocal disorders via formant synthesis , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[19]  B. Hammarberg,et al.  Vocal Fold Physiology: Acoustic, Perceptual, and Physiological Aspects of Voice Mechanisms , 1991 .

[20]  H. Kasuya Acoustic analysis, synthesis and perception of breathy voice , 1991 .

[21]  Hanspeter Herzel,et al.  CHAOS AND BIFURCATIONS DURING VOICED SPEECH , 1991 .

[22]  D G Childers,et al.  Vocal quality factors: analysis, synthesis, and perception. , 1991, The Journal of the Acoustical Society of America.

[23]  Robert C. Peppard,et al.  Aerodynamic, laryngoscopic, and perceptual-acoustic characteristics in dysphonic females with posterior glottal chinks: A retrospective study , 1992 .

[24]  Christer Gobl,et al.  Acoustic characteristics of voice quality , 1992, Speech Commun..

[25]  J. Kreiman,et al.  Perceptual evaluation of voice quality: review, tutorial, and a framework for future research. , 1993, Journal of speech and hearing research.

[26]  J Kreiman,et al.  Comparing internal and external standards in voice quality judgments. , 1993, Journal of speech and hearing research.

[27]  William L. Hays,et al.  Statistics, 5th ed. , 1994 .

[28]  V. Wolfe,et al.  Pathologic voice type and the acoustic prediction of severity. , 1995, Journal of speech and hearing research.

[29]  J Kreiman,et al.  The perceptual structure of pathologic voice quality. , 1996, The Journal of the Acoustical Society of America.

[30]  J. Hillenbrand,et al.  Acoustic correlates of breathy vocal quality: dysphonic voices and continuous speech. , 1996, Journal of speech and hearing research.

[31]  D. Berry,et al.  Bifurcations in excised larynx experiments. , 1995, Journal of voice : official journal of the Voice Foundation.

[32]  Effects of Perceptual Training Based upon Synthesized Voice Signals , 1996, Perceptual and motor skills.

[33]  K. Omori,et al.  Acoustic characteristics of rough voice: subharmonics. , 1997, Journal of voice : official journal of the Voice Foundation.

[34]  A. Gilbert,et al.  Auditory Pitch as a Perceptual Analogue to Odor Quality , 1997 .

[35]  J Kreiman,et al.  Validity of rating scale measures of voice quality. , 1998, The Journal of the Acoustical Society of America.

[36]  Jody Kreiman,et al.  Toward a taxonomy of nonmodal phonation , 2001, J. Phonetics.