A MODELING OF THE OBJECTIVE EVALUATION OF DURATIONAL RULES BASED ON AUDITORY PERCEPTUAL CHARACTERISTICS

The subjective acceptability of temporal distortions in speech segments is significantly affected by several phonetic factors, e.g., the vowel color. This study proposes a model for evaluating the temporal errors of durational rules for speech synthesis. Based on auditory perceptual characteristics recently found by the authors, the model predicts, to some extent, acceptability to human listeners (a subjective measure) solely from objective measures (physical properties) of the speech signal. To this end, the loudness contour is calculated as a main cue to temporal change in the speech signal. In an experiment testing its effectiveness, the proposed model consistently predicted human evaluation more closely than a reference model that used only the average acoustic error, without any perceptual consideration.
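To make the two kinds of objective measure concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the reference measure is a plain mean absolute duration error over segments, and it approximates the loudness contour by frame power raised to a compressive exponent of 0.3 (a power-law approximation chosen here for illustration). The function names, the frame length, and the exponent are all hypothetical; the abstract does not specify how either quantity is computed.

```python
def mean_abs_duration_error(reference_ms, synthetic_ms):
    """Assumed reference-style measure: average absolute duration error (ms)
    over segments, with no perceptual weighting."""
    if len(reference_ms) != len(synthetic_ms):
        raise ValueError("segment lists must have the same length")
    if not reference_ms:
        raise ValueError("at least one segment is required")
    total = sum(abs(r - s) for r, s in zip(reference_ms, synthetic_ms))
    return total / len(reference_ms)


def loudness_contour(samples, frame_len, exponent=0.3):
    """Crude loudness proxy (an assumption, not the paper's method):
    mean power of each non-overlapping frame, raised to `exponent`."""
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    contour = []
    # Step through complete frames only; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        contour.append(power ** exponent)
    return contour


if __name__ == "__main__":
    # Three segments: natural durations vs. rule-generated durations (ms).
    print(mean_abs_duration_error([100, 80, 120], [110, 80, 100]))
    # A silent frame followed by a full-scale frame.
    print(loudness_contour([0.0] * 4 + [1.0] * 4, frame_len=4))
```

A perceptually motivated model would use information such as this contour to decide how heavily each segment's duration error should count, whereas the unweighted average treats every segment alike.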