Paper information - A MODELING OF THE OBJECTIVE EVALUATION OF DURATIONAL RULES BASED ON AUDITORY PERCEPTUAL CHARACTERISTICS

Human subjective acceptability of temporal distortions in speech segments is significantly affected by several phonetic factors, such as vowel color. The current study proposes a model for evaluating the temporal errors of synthesis rules. Using only objective measures (physical properties) of speech signals, the model predicts, to some extent, acceptability to humans (a subjective measure), based on auditory perceptual characteristics recently found by the authors. To accomplish this, the loudness contour is calculated as a main cue for temporal change in a speech signal. An experiment testing the effectiveness of the model showed that it consistently achieved better prediction (i.e., closer to human evaluation) than the reference model, which used only the average acoustic errors without any perceptual consideration.

Yoshinori Sagisaka | Minoru Tsuzaki | Hiroaki Kato
