Perceptual relevance of objectively measured descriptors for speaker characterization

Subjective testing of speaker recognizability is an intricate, time-consuming, and very expensive process. Augmenting subjective speaker recognizability tests with objectively measurable descriptors could increase their efficiency and reliability. This paper investigates the relevance of a set of objective descriptors to human perception of speaker identity, using multidimensional scaling (MDS) of subjective speaker-pair similarity judgments. With only 3 seconds of speech, the evaluated objective descriptors achieve same/different detection error rates as low as 4.13% for male speaker pairs and 8.17% for female speaker pairs. Five descriptors related to glottal, vocal tract, and prosodic features were found to correlate significantly with the perceptual dimensions of the MDS solutions.
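
As a rough illustration of what a same/different detection error rate measures, the sketch below thresholds a distance between the descriptor vectors of two utterances and counts the misclassified pairs. The Euclidean distance, the threshold value, the descriptor choice, and the toy numbers are all assumptions made for illustration; they are not the descriptors, decision rule, or data used in the paper.

```python
import math

def distance(a, b):
    """Euclidean distance between two descriptor vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def detection_error_rate(pairs, threshold):
    """Fraction of pairs misclassified when 'same speaker' means distance < threshold.

    pairs: list of (vec_a, vec_b, is_same) tuples.
    """
    errors = 0
    for vec_a, vec_b, is_same in pairs:
        predicted_same = distance(vec_a, vec_b) < threshold
        if predicted_same != is_same:
            errors += 1
    return errors / len(pairs)

# Hypothetical 2-D descriptor vectors: (mean pitch in Hz, vocal tract length in cm).
# The features are left unscaled here to keep the sketch short.
toy_pairs = [
    ((120.0, 17.0), (122.0, 17.1), True),
    ((120.0, 17.0), (150.0, 16.0), False),
    ((210.0, 14.5), (208.0, 14.6), True),
    ((210.0, 14.5), (214.0, 14.4), False),  # different speakers, similar descriptors
]

rate = detection_error_rate(toy_pairs, threshold=5.0)
print(f"error rate: {rate:.2%}")
```

In this toy set the last pair is judged "same" although the speakers differ, so one of the four pairs is misclassified. In practice the descriptors would be normalized before computing distances, and the threshold would be tuned on held-out pairs.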
