ABSTRACT

Discourse boundaries have been associated with an increased rate of disfluent events. It is hypothesized that the reason for this increase is the heavy processing requirement incurred either in planning the next chunk of discourse or in the introduction of many new or high perplexity entities. In a sample of academic lecture speech, we find that non-error disfluencies (such as filled pauses) occur preferentially shortly after (but not right at) the beginning of a new discourse segment. This suggests that the processing load may not increase just at the boundary but instead somewhat later, i.e. that the speaker can make use of the results of earlier planning during the first portion of the new segment. In contrast, errors of selection or serial ordering of grammatical elements do not show a boundary-related peak in their distribution across a discourse segment, supporting the hypothesis that this second kind of nonfluent event arises at a different point in the speech production planning process.

Keywords: Disfluencies, discourse structure, speech errors.

1. BACKGROUND

Discourse boundaries have been associated with a variety of acoustic events, including an increased rate of disfluencies [4] [10]. For example, in an experimental study of spontaneous Dutch monologues, Swerts et al. [10] found that the distribution of filled pauses varied by strength of discourse boundary: they were more prevalent in the Intonational Phrase just after a strong discourse boundary (one labelled by more than 75% of labeller subjects) than after a weak boundary (one labelled by fewer than 75% of labellers). Watanabe found similar results for informal Japanese spontaneous speech [14], although not in academic lectures and prepared conference talks [13].
More generally, Arnold and colleagues [1] found a correlation between disfluencies and new information in English, and showed that disfluencies (including repairs, repeats, metalinguistic comments as well as filled pauses) occur more frequently at the beginnings of utterances. A subsequent perceptual study [2] revealed that disfluencies cue listeners to the presence of new, and presumably more difficult to process, information. These studies focused on dialogs comprised of short utterances. However, an association between disfluencies and the onsets of larger discourse segments would also be expected, since theories of discourse, such as Grosz et al.'s Centering theory [6] [7], would place more new (or re-introduced but still higher perplexity) information at the beginnings of new discourse segments.

In a study of a longer dialogue in English, Veilleux [12] described instabilities, i.e. general areas of disfluency (as well as the presence of shorter discourse segments) in regions between long, stable discourse segments. In contrast to the work cited above, she examined both sides of the discourse boundary, and found instability on both sides. She postulated that these regions were bridges between stable discourse segments, i.e. discourse regions where participants in a dialogue negotiate what new topic will follow.

These results suggest two questions:

1. Does nonfluent speech occur more frequently in discourse segment initial or final positions than elsewhere?
2. Do different kinds of disfluencies behave differently in this regard?

This work explores these two questions by examining two types of nonfluent speech (lexical and non-lexical) to determine whether the likelihood of a nonfluency changes across a discourse segment, and whether the distribution pattern varies for different types of nonfluency.
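As an illustration of how the first question could be put into practice, the sketch below computes a per-word disfluency rate as a function of relative position within a discourse segment. It is not the procedure used in this study: the function name `positional_rates`, the representation of a segment as a word count plus a list of disfluent word indices, and the choice of equal-width position bins are all assumptions made for the example.

```python
def positional_rates(segments, n_bins=5):
    """Disfluency rate per word in each relative-position bin.

    segments: list of (n_words, disfluent_indices) pairs, where
    disfluent_indices are 0-based word positions inside the segment.
    This data layout is hypothetical, chosen only for illustration.
    """
    disfluent = [0] * n_bins  # disfluent words observed in each bin
    total = [0] * n_bins      # all words observed in each bin

    for n_words, disfluent_indices in segments:
        # Map every word position to a bin by its relative position.
        for i in range(n_words):
            total[i * n_bins // n_words] += 1
        for i in disfluent_indices:
            if not 0 <= i < n_words:
                raise ValueError("disfluency index outside segment")
            disfluent[i * n_bins // n_words] += 1

    # Normalize by word count so that long and short segments
    # contribute in proportion to the words they place in each bin.
    return [d / t if t else 0.0 for d, t in zip(disfluent, total)]


# Two toy 10-word segments with disfluencies near the onset.
rates = positional_rates([(10, [2, 3]), (10, [1])], n_bins=5)
print(rates)
```

Running the same function separately on each class of nonfluency (for example, filled pauses versus selection or ordering errors) would give one rate profile per class, which is one way to compare their distributions for the second question.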
[1] M. Swerts. Prosodic features at discourse boundaries of different strength. The Journal of the Acoustical Society of America, 1997.
[2] Michiko Watanabe. Fillers as Indicators of Discourse Segment Boundaries in Japanese Monologues. 2002.
[3] S. Shattuck-Hufnagel. The role of word structure in segmental serial ordering. Cognition, 1992.
[4] Candace L. Sidner, et al. Attention, Intentions, and the Structure of Discourse. CL, 1986.
[5] Julia Hirschberg, et al. Some intonational characteristics of discourse structure. ICSLP, 1992.
[6] Jennifer E. Arnold, et al. Heaviness vs. newness: The effects of structural complexity and discourse status on constituent ordering. 2015.
[7] M. Swerts. Filled pauses as markers of discourse structure. 1998.
[8] Scott Weinstein, et al. Centering: A Framework for Modeling the Local Coherence of Discourse. CL, 1995.
[9] Stefanie Shattuck-Hufnagel, et al. The original ToBI system and the evolution of the ToBI framework. 2003.
[10] H. H. Clark, et al. Using uh and um in spontaneous speaking. Cognition, 2002.
[11] Nanette Veilleux. Bridges: regions between discourse segments. INTERSPEECH, 2002.
[12] Jennifer E. Arnold, et al. Disfluencies Signal Theee, Um, New Information. Journal of Psycholinguistic Research, 2003.
[13] Elizabeth Shriberg. To ‘errrr’ is human: ecology and acoustics of speech disfluencies. Journal of the International Phonetic Association, 2001.