On the just noticeable difference for tempo in speech

Abstract. Speakers vary their speech tempo (speaking rate), and such variations are quite noticeable. But what is the just noticeable difference (JND) for tempo in speech? The present study aims to provide a realistic and robust estimate by using multiple speech tokens from multiple speakers. The JND is assessed in two comparison experiments (2IAX and 2IFC), which yield an estimated JND for speech tempo of about 5%. A control experiment suggests that this finding is not due to acoustic artefacts of the tempo-transformation method used. Tempo variations within speakers typically exceed this JND, which makes such variations relevant in speech communication.
