Functional differences between vowel onsets and offsets in temporal perception of speech: local-change detection and speaking-rate discrimination.

To provide a perceptual framework for the objective evaluation of durational rules in speech synthesis, two experiments were conducted to investigate how vowel (V) onsets and V-offsets differ in marking the perceived temporal structure of speech. The first experiment measured the detectability of temporal modifications applied to four-mora (CVCVCVCV) Japanese words. In the V-onset condition, the inter-onset intervals of vowels were uniformly changed (either expanded or reduced) while their inter-offset intervals were preserved. In the V-offset condition, the manipulation was reversed: the inter-offset intervals were changed and the inter-onset intervals were preserved. Neither manipulation changed the duration of the entire word. Each modified word was paired with its unmodified counterpart, and listeners were asked to rate the difference between the paired words. The results showed no significant difference between the V-onset and V-offset conditions in the listeners' ability to detect the temporal modification. In the second experiment, the listeners were asked to estimate the differences they perceived in speaking rate for the same stimulus set as in the first experiment. Here the results showed a clear difference in the listeners' performance between the two conditions. Specifically, changing the V-onset intervals changed the perceived speaking rate, which varied linearly with the interval change (r = -0.9) even though the duration of the entire word remained unchanged. In contrast, modifying the V-offset intervals produced no clear relation with the perceived speaking rate. The second experiment also showed that the listeners performed well in speaking-rate discrimination (3.5%-5% in the change ratio). These results are discussed in relation to the differences in the listeners' temporal processing range (local or global) between the two experiments.
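The stimulus manipulation described above can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the example timings, and the choice of anchor point (first vowel onset in the V-onset condition, last vowel offset in the V-offset condition, so that the word's total duration stays fixed) are all assumptions made for illustration.

```python
# Hypothetical sketch of the V-onset / V-offset manipulation.
# All names, timings, and anchoring choices are illustrative assumptions,
# not details taken from the paper.

def scale_about(times, ratio, anchor_index):
    """Scale every interval between time points by `ratio`,
    keeping the point at `anchor_index` fixed."""
    anchor = times[anchor_index]
    return [anchor + (t - anchor) * ratio for t in times]


def intervals(times):
    """Return the intervals between successive time points."""
    return [b - a for a, b in zip(times, times[1:])]


def modify_word(onsets, offsets, ratio, condition):
    """Return (new_onsets, new_offsets) in ms for one condition.

    "V-onset":  inter-onset intervals are scaled, offsets are untouched.
    "V-offset": inter-offset intervals are scaled, onsets are untouched.
    """
    if condition == "V-onset":
        # Assumption: the first vowel onset is the fixed anchor.
        new_onsets = scale_about(onsets, ratio, 0)
        new_offsets = list(offsets)
    elif condition == "V-offset":
        # Assumption: the last vowel offset (the word end) is the anchor,
        # so the duration of the whole word does not change.
        new_onsets = list(onsets)
        new_offsets = scale_about(offsets, ratio, len(offsets) - 1)
    else:
        raise ValueError(f"unknown condition: {condition!r}")

    # Each vowel must still start before it ends and end before the
    # next vowel starts; otherwise the change ratio is too large.
    for i, (on, off) in enumerate(zip(new_onsets, new_offsets)):
        if not on < off:
            raise ValueError("a vowel would have non-positive duration")
        if i + 1 < len(new_onsets) and not off < new_onsets[i + 1]:
            raise ValueError("a consonant would have non-positive duration")
    return new_onsets, new_offsets


# Made-up vowel boundaries (ms) for a CVCVCVCV word.
onsets = [50.0, 200.0, 350.0, 500.0]
offsets = [130.0, 280.0, 430.0, 580.0]

on_a, off_a = modify_word(onsets, offsets, 1.1, "V-onset")
on_b, off_b = modify_word(onsets, offsets, 1.1, "V-offset")

print("V-onset condition, inter-onset intervals:", intervals(on_a))
print("V-offset condition, inter-offset intervals:", intervals(off_b))
```

In both conditions the last vowel offset, and therefore the word's end, stays where it was, so any perceived change in speaking rate cannot be attributed to a change in total word duration.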
