Stochastic suprasegmentals: relationships between redundancy, prosodic structure and care of articulation in spontaneous speech

Within spontaneous speech there are wide variations in the articulation of the same word by the same speaker. This paper explores two related factors that influence variation in articulation: prosodic structure and redundancy. We argue that the constraint of producing robust communication while expending articulatory effort efficiently leads to an inverse relationship between language redundancy and care of articulation. This inverse relationship improves robustness by spreading information more evenly across the speech signal, leading to a smoother signal redundancy profile. We argue that prosodic prominence is a linguistic means of achieving smooth signal redundancy: it increases care of articulation and coincides with unpredictable sections of speech. Results confirm the strong relationship between prosodic prominence and care of articulation, as well as an inverse relationship between language redundancy and care of articulation. In addition, when variation in prosodic boundaries is controlled for, language redundancy can predict up to 65% of the variance in raw syllabic duration. This is comparable with the 64% predicted by prosodic prominence (accent, lexical stress and vowel type). Moreover, most (62%) of this predictive power is shared. This suggests that, in English, prosodic structure is the means by which constraints caused by a robust signal requirement are expressed in spontaneous speech.
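The shared predictive power reported above can be made concrete with a small arithmetic sketch of a two-predictor variance partition. It assumes that the 62% figure is a proportion of the total variance in syllabic duration (the abstract does not say so explicitly) and that the partition is a standard commonality analysis. The function name `partition_variance` is hypothetical, and the derived values are illustrative consequences of that assumption, not results reported in the paper.

```python
def partition_variance(r2_a, r2_b, r2_shared):
    """Split explained variance for two predictor sets into unique and shared parts.

    r2_a, r2_b: variance explained by each predictor set on its own.
    r2_shared:  variance explained by both (ASSUMED to be a share of total variance).
    """
    unique_a = r2_a - r2_shared          # explained only by predictor set A
    unique_b = r2_b - r2_shared          # explained only by predictor set B
    combined = unique_a + unique_b + r2_shared  # implied joint-model R^2
    return {
        "unique_a": unique_a,
        "unique_b": unique_b,
        "shared": r2_shared,
        "combined": combined,
    }


# Figures from the abstract: redundancy 65%, prosodic prominence 64%, shared 62%.
parts = partition_variance(r2_a=0.65, r2_b=0.64, r2_shared=0.62)
for name, value in parts.items():
    print(f"{name}: {value:.2f}")
```

Under this reading, each factor would contribute only a few percentage points that the other does not, which is in line with the abstract's claim that prosodic structure and redundancy largely account for the same variation.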
