Segmental intelligibility and speech interference thresholds of high-quality synthetic speech in presence of noise.

Technological advancement in the area of synthetic speech has made it increasingly difficult to distinguish the quality of speech based solely on intelligibility scores obtained in benign laboratory conditions. Intelligibility scores obtained for natural speech and a high-quality text-to-speech system (DECtalk) are not substantially different. This study examined the perceived intelligibility and speech interference thresholds of DECtalk male and female voices and compared them with data obtained for natural speech. Results revealed that decreasing signal-to-noise ratios had more deleterious effects on the perception of DECtalk male and female voices than on the perception of natural speech. Analysis of the phoneme errors revealed that similar general patterns of errors tended to occur in DECtalk and in natural speech. The speech interference test did not demonstrate any significant difference between the DECtalk male and female voices. This result was supported by the absence of a significant difference between DECtalk male and female voices during intelligibility testing at different signal-to-noise ratios.
