Improving Conditioning in Context-Aware Sequence to Sequence Models

Neural sequence-to-sequence models are well established for applications that can be cast as mapping a single input sequence into a single output sequence. In this work, we focus on cases where generation is conditioned on both a short query and a long context, such as abstractive question answering or document-level translation. We modify the standard sequence-to-sequence approach to make better use of both the query and the context by expanding the conditioning mechanism to intertwine query and context attention. We also introduce a simple and efficient data augmentation method for the proposed model. Experiments on three different tasks show that both changes lead to consistent improvements.
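
The abstract does not spell out the conditioning mechanism, so the following is only a minimal illustrative sketch of one way a Transformer decoder layer could intertwine query and context attention: two cross-attention sub-layers are interleaved, one over the encoded query and one over the encoded long context. The class name, the ordering of the sub-layers, and the dimensions below are assumptions for illustration, not the paper's implementation.

# Illustrative sketch (PyTorch): a decoder layer that attends to both a short
# query and a long context. Hypothetical design, not the authors' exact model.
import torch
import torch.nn as nn

class QueryContextDecoderLayer(nn.Module):
    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout)
        self.query_attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout)
        self.context_attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norms = nn.ModuleList(nn.LayerNorm(d_model) for _ in range(4))
        self.dropout = nn.Dropout(dropout)

    def forward(self, tgt, query_enc, context_enc):
        # 1) decoder self-attention over the partial output sequence
        h, _ = self.self_attn(tgt, tgt, tgt)
        x = self.norms[0](tgt + self.dropout(h))
        # 2) cross-attention over the encoded query (e.g. the question or source sentence)
        h, _ = self.query_attn(x, query_enc, query_enc)
        x = self.norms[1](x + self.dropout(h))
        # 3) cross-attention over the encoded long context (e.g. the supporting document)
        h, _ = self.context_attn(x, context_enc, context_enc)
        x = self.norms[2](x + self.dropout(h))
        # 4) position-wise feed-forward
        return self.norms[3](x + self.dropout(self.ffn(x)))

if __name__ == "__main__":
    layer = QueryContextDecoderLayer()
    tgt = torch.randn(20, 2, 512)       # (tgt_len, batch, d_model)
    query = torch.randn(15, 2, 512)     # encoded query
    context = torch.randn(400, 2, 512)  # encoded long context
    print(layer(tgt, query, context).shape)  # torch.Size([20, 2, 512])

In a fairseq-style setup the query and the context would typically be produced by separate (or shared) encoders; the sketch simply takes their outputs as inputs and leaves masking and incremental decoding aside.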
