QuickEdit: Editing Text & Translations via Simple Delete Actions

We propose a framework for computer-assisted text editing that applies to translation post-editing and paraphrasing and relies on a very simple interaction: a human editor marks the tokens in a sentence they want the system to change. Our model then generates a new sentence that reformulates the original while avoiding the marked words. The approach builds on neural sequence-to-sequence modeling, introducing a network that takes as input a sentence together with deleted-token markers; it is trained on translation bi-text by simulating post-edits. We evaluate the approach on machine translation post-editing and on paraphrasing. With limited post-editing effort on the WMT-14 English-German translation task, we obtain +11.4 BLEU over the initial translation (25.2 to 36.6), a +5.9 BLEU gain over the post-editing baseline (30.7 to 36.6).
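As a rough illustration of how post-edits can be simulated from bi-text, the sketch below marks tokens in a machine-translation output that do not appear in the reference translation, producing (token, delete-flag) pairs of the kind a marker-aware encoder could consume. This is a simplified assumption on my part (the function name `simulate_post_edit` and the exact-match heuristic are hypothetical); a real pipeline would likely use alignments or edit operations rather than bag-of-words membership.

```python
def simulate_post_edit(mt_tokens, ref_tokens):
    """Simulate a human post-edit on translation bi-text.

    Hypothetical heuristic: flag each MT token with 1 ("delete") if it
    does not occur anywhere in the reference, else 0 ("keep"). The
    resulting pairs could feed an encoder that embeds both the token
    and its marker.
    """
    ref_set = set(ref_tokens)
    return [(tok, 0 if tok in ref_set else 1) for tok in mt_tokens]


# Example: "cat" is absent from the reference, so it gets marked.
pairs = simulate_post_edit(["the", "cat", "sat"], ["the", "dog", "sat"])
print(pairs)  # [('the', 0), ('cat', 1), ('sat', 0)]
```

During training, the reference sentence then serves as the target output, so the model learns to rewrite the marked spans while keeping the rest.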
