Encode, Tag, Realize: High-Precision Text Editing

We propose LaserTagger, a sequence tagging approach that casts text generation as a text editing task. Target texts are reconstructed from the inputs using three main edit operations: keeping a token, deleting it, and adding a phrase before the token. To predict the edit operations, we propose a novel model that combines a BERT encoder with an autoregressive Transformer decoder. We evaluate this approach on English text on four tasks: sentence fusion, sentence splitting, abstractive summarization, and grammar correction. LaserTagger achieves new state-of-the-art results on three of these tasks, performs comparably to a set of strong seq2seq baselines when a large number of training examples is available, and outperforms them when the number of examples is limited. Furthermore, we show that at inference time tagging can be more than two orders of magnitude faster than comparable seq2seq models, making it more attractive for running in a live environment.
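
The three edit operations can be illustrated with a minimal sketch of the realization step, which turns per-token tags back into text. This is not the authors' implementation: the `realize` function, the `(operation, phrase)` tag representation, and the example sentence are assumptions made here for illustration only, and tag prediction by the model is not shown.

```python
from typing import List, Tuple

# Hypothetical tag format (an assumption, not taken from the paper):
# (base_op, added_phrase), where base_op is "KEEP" or "DELETE" and
# added_phrase, if non-empty, is inserted before the source token.
Tag = Tuple[str, str]


def realize(tokens: List[str], tags: List[Tag]) -> str:
    """Rebuild the target text from source tokens and one tag per token."""
    if len(tokens) != len(tags):
        raise ValueError("need exactly one tag per source token")
    out: List[str] = []
    for token, (op, phrase) in zip(tokens, tags):
        if phrase:
            # "Adding a phrase before the token."
            out.append(phrase)
        if op == "KEEP":
            out.append(token)
        elif op != "DELETE":
            raise ValueError(f"unknown operation: {op}")
    return " ".join(out)


# Illustrative sentence fusion: merge two sentences into one.
source = "Turing was born in 1912 . Turing died in 1954 .".split()
tags = (
    [("KEEP", "")] * 5        # Turing was born in 1912
    + [("DELETE", "")]        # drop the first full stop
    + [("DELETE", "and")]     # drop the second "Turing", add "and" before it
    + [("KEEP", "")] * 4      # died in 1954 .
)
fused = realize(source, tags)
print(fused)
```

Because every output token is either copied from the input or drawn from the added phrases, most of the target is reconstructed by copying and only the added phrases have to be generated.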
