Integrating sentence- and word-level error identification for disfluency correction

When speaking spontaneously, speakers often make errors such as self-corrections and false starts, which interfere with the successful application of natural language processing techniques like summarization and machine translation to this data. There is active work on reconstructing this errorful data into a clean and fluent transcript by identifying and removing these simple errors. Previous research has approximated the potential benefit of conducting word-level reconstruction of simple errors only on those sentences known to have errors. In this work, we explore new approaches for automatically identifying speaker construction errors at the utterance level, and quantify the impact that this initial step has on word- and sentence-level reconstruction accuracy.
