Generating summary sentences with preserved meaning is important for the summarization of longer documents. Length control of summary sentences is challenging because sentences cannot simply be cut at the desired length; they must be complete and preserve the meaning of the input. We propose a modular framework for length control of generated sentences: it is based on sequence-to-sequence models and powered by a two-stage training process, in which a summarizer is trained without explicit length control and a stylizer is then fine-tuned on the output of the summarizer. Our solution matches the performance of existing models for controlling generated sentence length while remaining light in implementation and model complexity.

Automatically generated, accurate summaries are becoming critically important due to the sheer amount of information available on any given subject. Text summarization achieves this end by rewriting a sentence or a paragraph into a shorter one while retaining its meaning. Summarization is classified as either extractive, where the output sentence is composed directly of fragments of the input, or abstractive, where the generated output may contain text that is not present in the input. Abstractive summarization is more natural and closer to what a human would do: humans can paraphrase a story or an article and express it in their own words.

A number of applications of summarization today impose a length limit on the desired output. For example, a reader may have very limited time and prefer a shorter output. Alternatively, the summary may be broadcast through a service with a hard character limit (e.g. Twitter, SMS). Summarization tasks can vary from the summary of a single document to the summary of multiple documents, with summaries varying in length. Our focus is on abstractive summarization at the sentence level.

Neural sequence-to-sequence models have proven to be successful in abstractive summarization. Until recently, these methods offered no means of explicit length control, instead allowing the model to define the length implicitly through the decoding process (e.g. beam search). Recent works have shown that explicit length control is possible through different architectural modifications.
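To make the two-stage idea concrete, the following is a minimal sketch of how such a summarizer-then-stylizer pipeline could be trained, assuming a PyTorch environment. The class and function names (Seq2Seq, train_stage), the dummy data, and in particular the length-conditioning scheme (a prepended length-bucket token) are illustrative assumptions, not the authors' actual implementation.

```python
# Sketch only: a tiny GRU encoder-decoder reused for both stages.
import torch
import torch.nn as nn

VOCAB, EMB, HID, PAD = 1000, 64, 128, 0

class Seq2Seq(nn.Module):
    """Minimal encoder-decoder used for both the summarizer and the stylizer."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB, padding_idx=PAD)
        self.enc = nn.GRU(EMB, HID, batch_first=True)
        self.dec = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, src, tgt_in):
        _, h = self.enc(self.emb(src))              # encode the input sequence
        dec_out, _ = self.dec(self.emb(tgt_in), h)  # teacher-forced decoding
        return self.out(dec_out)                    # per-step vocabulary logits

    @torch.no_grad()
    def greedy_decode(self, src, max_len=20, bos=1):
        _, h = self.enc(self.emb(src))
        tok = torch.full((src.size(0), 1), bos, dtype=torch.long)
        outs = []
        for _ in range(max_len):
            dec_out, h = self.dec(self.emb(tok), h)
            tok = self.out(dec_out).argmax(-1)
            outs.append(tok)
        return torch.cat(outs, dim=1)

def train_stage(model, pairs, epochs=1, lr=1e-3):
    """Standard cross-entropy training on (input, target) token-sequence pairs."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)
    for _ in range(epochs):
        for src, tgt in pairs:
            logits = model(src, tgt[:, :-1])
            loss = loss_fn(logits.reshape(-1, VOCAB), tgt[:, 1:].reshape(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

# Dummy corpora standing in for (document, reference summary) pairs.
docs = [torch.randint(3, VOCAB, (4, 30)) for _ in range(8)]
refs = [torch.randint(3, VOCAB, (4, 12)) for _ in range(8)]

# Stage 1: train the summarizer with no explicit length control.
summarizer = train_stage(Seq2Seq(), list(zip(docs, refs)))

# Stage 2: fine-tune a stylizer on the summarizer's own drafts. Here a
# hypothetical desired-length token is prepended to each draft so the stylizer
# learns to rewrite the draft to the requested length.
LEN_TOKENS = {8: 3, 12: 4, 16: 5}  # hypothetical length-bucket tokens
drafts = [summarizer.greedy_decode(d) for d in docs]
styl_inputs = [torch.cat([torch.full((d.size(0), 1), LEN_TOKENS[12]), d], dim=1)
               for d in drafts]
stylizer = Seq2Seq()
stylizer.load_state_dict(summarizer.state_dict())  # warm-start from the summarizer
stylizer = train_stage(stylizer, list(zip(styl_inputs, refs)))
```

The separation keeps each component simple: the summarizer can be any off-the-shelf sequence-to-sequence model trained as usual, and only the lightweight stylizer stage needs to know about the length constraint.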