Form variation of English function words in conversation

Function words (the, that, and, of, ...) vary widely in pronunciation. Understanding this variation is essential both for cognitive modeling of lexical production and for computer speech recognition and synthesis. This study investigates which factors affect the forms of function words, especially whether they have a fuller pronunciation or a more reduced or lenited pronunciation. It is based on over 8000 occurrences of ten frequent English function words in a four-hour sample of conversations from the Switchboard corpus. Ordinary linear and logistic regression models were used to examine variation in the length of the words, in the form of their vowel (basic, full, or reduced), and in whether final obstruents were present. For all these measures, after controlling for segmental context, rate of speech, and other important factors, there are strong independent effects that make function words more likely to be longer or to have a fuller form (1) when neighboring disfluencies (such as the filled pauses uh and um) indicate that the speaker was encountering problems in planning the utterance; (2) when the word is unexpected, i.e., less predictable in context; and (3) when the word is either utterance-initial or utterance-final. Looked at the other way, function words are more likely to be shorter and to have less full forms in fluent speech, in predictable positions or multi-word collocations, and utterance-internally. Also considered are other factors such as sex (women are more likely to use fuller forms, even after controlling for rate of speech) and some of the differences among the ten function words in their response to these factors.
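To make the notion of a word being "less predictable in context" concrete, the following is a minimal sketch of one possible predictability measure: the conditional probability of a word given the preceding word, estimated from bigram counts. The abstract does not specify how predictability was computed, so this choice of measure, the function name `bigram_predictability`, and the toy utterance are all assumptions made for illustration.

```python
from collections import Counter

def bigram_predictability(tokens):
    """Return (previous word, word, P(word | previous word)) for each
    token after the first, using maximum-likelihood estimates taken
    from the token list itself. Hypothetical helper, not the study's code."""
    # Count how often each word appears as a left context.
    context_counts = Counter(tokens[:-1])
    # Count each adjacent word pair.
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    return [
        (prev, word, bigram_counts[(prev, word)] / context_counts[prev])
        for prev, word in zip(tokens, tokens[1:])
    ]

# Toy utterance invented for illustration only (not Switchboard data).
toy_tokens = "i think that the cat and the dog sat on the mat".split()
scores = bigram_predictability(toy_tokens)

for prev, word, prob in scores:
    print(f"P({word} | {prev}) = {prob:.2f}")
```

In this toy input, "the" always follows "that", "and", or "on", so it is fully predictable from its left context, while each noun after "the" is less so. Under the pattern the abstract reports, lower values of such a measure would go with longer or fuller forms; real estimates would need counts from a much larger corpus.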
