Processing Self-Corrections in a Speech-to-Speech System

Speech repairs occur frequently in spontaneous spoken dialogues. The ability to detect and correct those repairs is necessary for any spoken language system. We present a framework for detecting and correcting speech repairs in which all relevant levels of information, i.e., acoustics, lexis, syntax, and semantics, can be integrated. The basic idea is to reduce the search space for repairs as early as possible by cascading filters that involve more and more features. First, an acoustic module generates hypotheses about the existence of a repair. Second, a stochastic model suggests a correction for every hypothesis. Well-scored corrections are inserted as new paths in the word lattice. Finally, a lattice parser decides whether to accept the repair.
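The cascade described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: all names (`RepairHypothesis`, `acoustic_filter`, `suggest_correction`, `process`) are hypothetical, the word lattice is simplified to a plain list of alternative word sequences, the stochastic correction model is replaced by a toy word-overlap heuristic, and the lattice parser is reduced to a caller-supplied predicate.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RepairHypothesis:
    interruption_point: int                  # index of the last word before the break
    acoustic_score: float                    # assumed to come from an acoustic module
    correction: Optional[List[str]] = None   # word sequence with the reparandum removed
    correction_score: float = 0.0


def acoustic_filter(cue_scores: List[float], threshold: float = 0.5) -> List[RepairHypothesis]:
    """Stage 1: keep only positions whose (assumed) acoustic cue score is high."""
    return [RepairHypothesis(i, s) for i, s in enumerate(cue_scores) if s >= threshold]


def suggest_correction(words: List[str], hyp: RepairHypothesis, max_len: int = 3) -> RepairHypothesis:
    """Stage 2 (toy stand-in for a stochastic model): align the n words before
    the interruption point with the n words after it and score by overlap."""
    ip = hyp.interruption_point
    best_score, best = 0.0, None
    for n in range(1, max_len + 1):
        start = ip - n + 1
        if start < 0 or ip + 1 + n > len(words):
            break
        reparandum = words[start:ip + 1]
        repair = words[ip + 1:ip + 1 + n]
        overlap = sum(a == b for a, b in zip(reparandum, repair)) / n
        if overlap > best_score:
            # Candidate correction: drop the reparandum, keep the repair.
            best_score, best = overlap, words[:start] + words[ip + 1:]
    hyp.correction, hyp.correction_score = best, best_score
    return hyp


def process(words: List[str], cue_scores: List[float],
            parser_accepts: Callable[[List[str]], bool],
            min_score: float = 0.5) -> List[str]:
    """Run the cascade and return the word sequence the parser accepts."""
    paths = [list(words)]                    # simplified "lattice": alternative paths
    for hyp in acoustic_filter(cue_scores):
        suggest_correction(words, hyp)
        if hyp.correction is not None and hyp.correction_score >= min_score:
            paths.append(hyp.correction)     # Stage 3: insert well-scored corrections
    for path in reversed(paths):             # Stage 4: corrected paths are tried first
        if parser_accepts(path):
            return path
    return list(words)                       # fall back to the unmodified input


words = "I want to go on Monday on Tuesday".split()
cues = [0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.1, 0.1]   # made-up scores; break after "Monday"
toy_parser = lambda path: len(path) == len(set(path))  # rejects paths with repeated words
print(process(words, cues, toy_parser))
```

Each stage only sees what the previous stage let through, so the more expensive steps (correction scoring, parsing) run on a small number of candidates. In this sketch that is the only property carried over from the framework; the scores, thresholds, and acceptance test are illustrative placeholders.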
