Planning and Generating Natural and Diverse Disfluent Texts as Augmentation for Disfluency Detection

Existing approaches to disfluency detection depend heavily on human-annotated data. A number of data augmentation methods have been proposed to alleviate this dependence on labeled data. However, current augmentation approaches such as random insertion or repetition fail to resemble the training corpus well and typically produce unnatural disfluencies of limited variety. In this work, we propose a simple Planner-Generator disfluency generation model that produces natural and diverse disfluent texts as augmented data: the Planner decides where to insert disfluent segments, and the Generator follows this prediction to generate the corresponding segments. We further use the augmented data for pretraining and leverage it for the task of disfluency detection. Experiments demonstrate that our two-stage disfluency generation model outperforms existing baselines; the generated disfluent sentences significantly aid disfluency detection and lead to state-of-the-art performance on the Switchboard corpus.
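To make the two-stage pipeline concrete, below is a minimal sketch of the Planner-Generator control flow as described in the abstract. The actual models are neural; here both stages are stubbed with toy rules so the flow runs end-to-end, and all function names (`planner`, `generator`, `make_disfluent`) are hypothetical, not the paper's API.

```python
# Hypothetical sketch of the two-stage Planner-Generator pipeline:
# the Planner predicts insertion positions, the Generator produces a
# disfluent segment for each planned position. Both are stubbed here.

from typing import List

def planner(fluent_tokens: List[str]) -> List[int]:
    """Stage 1 (stub): for each token position, predict whether a
    disfluent segment should be inserted before it (1) or not (0).
    Toy rule: insert before the second token."""
    return [1 if i == 1 else 0 for i in range(len(fluent_tokens))]

def generator(fluent_tokens: List[str], position: int) -> List[str]:
    """Stage 2 (stub): generate a disfluent segment conditioned on the
    fluent sentence and the planned position. Toy rule: a repetition-
    style disfluency built from the preceding token plus a filler."""
    prev = fluent_tokens[position - 1]
    return [prev, "uh", prev]

def make_disfluent(sentence: str) -> str:
    """Run the full pipeline: plan insertion points, then generate and
    splice in the corresponding disfluent segments."""
    tokens = sentence.split()
    plan = planner(tokens)
    out: List[str] = []
    for i, tok in enumerate(tokens):
        if plan[i]:
            out.extend(generator(tokens, i))
        out.append(tok)
    return " ".join(out)

if __name__ == "__main__":
    print(make_disfluent("I want to book a flight"))
    # -> "I I uh I want to book a flight"
```

In the paper's setting, the stubbed rules would be replaced by learned models (e.g. a sequence labeler for the Planner and a conditional sequence generator for the Generator), and the resulting disfluent sentences, which carry token-level labels for free, would serve as pretraining data for a disfluency detector.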
