Controlling Output Length in Neural Encoder-Decoders

Neural encoder-decoder models have achieved great success in many sequence generation tasks. However, previous work has not investigated situations in which we would like to control the length of encoder-decoder outputs. This capability is crucial for applications such as text summarization, in which we must generate concise summaries of a desired length. In this paper, we propose four methods for controlling the output sequence length of neural encoder-decoder models: two decoding-based methods and two learning-based methods. Results on a summarization task show that our learning-based methods can control output length without degrading summary quality.
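
To make the decoding-based idea concrete, below is a minimal sketch (not the paper's actual algorithm) of one generic way to impose hard length constraints at decoding time: suppress the end-of-sequence token until a minimum length is reached, and cap generation at a maximum length. The names `step_fn`, `EOS_ID`, and the greedy setup are illustrative assumptions.

import numpy as np

EOS_ID = 0  # assumed id of the end-of-sequence symbol

def length_controlled_greedy_decode(step_fn, init_state, min_len, max_len):
    """Greedy decoding with hard length constraints.

    step_fn(state, prev_token) -> (logits, new_state) is assumed to wrap
    one step of an encoder-decoder's decoder.
    """
    tokens, state, prev = [], init_state, EOS_ID  # EOS_ID doubles as the start symbol here
    for t in range(max_len):
        logits, state = step_fn(state, prev)
        if t < min_len:
            logits[EOS_ID] = -np.inf  # forbid EOS before the minimum length
        token = int(np.argmax(logits))
        if token == EOS_ID:
            break
        tokens.append(token)
        prev = token
    return tokens  # never shorter than min_len, never longer than max_len

# Toy usage with a random "decoder" over a 50-word vocabulary (illustration only):
rng = np.random.default_rng(0)
dummy_step = lambda state, prev: (rng.normal(size=50), state)
print(length_controlled_greedy_decode(dummy_step, None, min_len=5, max_len=12))

The same masking trick extends directly to beam search: each hypothesis keeps its own length counter, and EOS is masked out of its candidate expansions until the counter reaches the minimum.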
