The output of a speech recognition system is not always ideal for downstream processing, in part because speakers themselves often make mistakes. A system accomplishes speech reconstruction of its spontaneous speech input when its output represents, in flawless, fluent, and content-preserving English, the message the speaker intended to convey. Such cleaner transcripts would allow more accurate language processing for NLP tasks such as machine translation and conversation summarization, which often rely on grammatical input. Recognizing that supervised statistical methods for identifying and transforming ill-formed regions of a transcript will require richly labeled resources, we have built the Spontaneous Speech Reconstruction (SSR) corpus. This small corpus of reconstructed and aligned transcriptions drawn from the Fisher conversational telephone speech corpus (Strassel and Walker, 2004) is annotated on several levels, including string transformations and predicate-argument structure, and will be shared with the linguistic research community.
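Because the corpus pairs each verbatim transcript with its reconstruction and annotates both the string transformations and the predicate-argument structure that relate them, a single annotated utterance might be represented roughly as sketched below. This is a minimal, hypothetical illustration only: the class names, edit labels, and role labels are assumptions for exposition, not the SSR corpus's actual annotation scheme or file format.

```python
# Hypothetical sketch of one reconstructed utterance with its string-transformation
# alignment and predicate-argument annotation (illustrative, not the SSR format).
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TokenEdit:
    """Aligns one original token to the reconstruction via an edit label."""
    original: str                 # token as transcribed
    reconstructed: Optional[str]  # None if the token is deleted
    label: str                    # e.g. "KEEP", "DELETE_FILLER", "DELETE_REPAIR"


@dataclass
class PredicateArgument:
    """A predicate with labeled argument spans over the reconstructed token list."""
    predicate_index: int                   # token index of the predicate
    arguments: List[Tuple[str, int, int]]  # (role, start, end) spans


@dataclass
class ReconstructedUtterance:
    edits: List[TokenEdit] = field(default_factory=list)
    propositions: List[PredicateArgument] = field(default_factory=list)

    def reconstruction(self) -> List[str]:
        return [e.reconstructed for e in self.edits if e.reconstructed is not None]


# Example: verbatim "uh I I want to to go home" -> reconstructed "I want to go home"
utt = ReconstructedUtterance(
    edits=[
        TokenEdit("uh", None, "DELETE_FILLER"),
        TokenEdit("I", None, "DELETE_REPAIR"),
        TokenEdit("I", "I", "KEEP"),
        TokenEdit("want", "want", "KEEP"),
        TokenEdit("to", None, "DELETE_REPAIR"),
        TokenEdit("to", "to", "KEEP"),
        TokenEdit("go", "go", "KEEP"),
        TokenEdit("home", "home", "KEEP"),
    ],
    propositions=[
        PredicateArgument(predicate_index=1,
                          arguments=[("ARG0", 0, 1), ("ARG1", 2, 5)]),
    ],
)
print(" ".join(utt.reconstruction()))  # -> "I want to go home"
```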