Salience Estimation via Variational Auto-Encoders for Multi-Document Summarization

We propose a new unsupervised sentence salience framework for Multi-Document Summarization (MDS) with two components: latent semantic modeling and salience estimation. For latent semantic modeling, we employ a neural generative model, the Variational Auto-Encoder (VAE), to describe the observed sentences and their latent semantic representations, and we use neural variational inference for posterior inference of the latent variables. For salience estimation, we propose an unsupervised data reconstruction framework that jointly considers reconstruction in the latent semantic space and in the observed term vector space, so that sentence salience is captured from two different and complementary vector spaces. The VAE-based latent semantic model is integrated into the salience estimation component in a unified fashion, and the whole framework is trained jointly by back-propagation via multi-task learning. Experimental results on the benchmark datasets DUC and TAC show that our framework outperforms state-of-the-art models.
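The latent semantic modeling step can be illustrated with a minimal NumPy sketch of a generic VAE forward pass and its negative evidence lower bound. This is not the authors' implementation: the single-hidden-layer architecture, the omission of bias terms, the layer sizes, the Bernoulli likelihood over binary term vectors, and all names (`init_params`, `vae_forward`, `negative_elbo`) are assumptions made for illustration, and the salience estimation component is not shown.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_params(vocab_size, hidden_size, latent_size, rng):
    # Hypothetical small random weights; biases are omitted for brevity.
    def w(rows, cols):
        return rng.normal(0.0, 0.1, size=(rows, cols))
    return {
        "W_enc": w(vocab_size, hidden_size),
        "W_mu": w(hidden_size, latent_size),
        "W_logvar": w(hidden_size, latent_size),
        "W_dec_h": w(latent_size, hidden_size),
        "W_dec_x": w(hidden_size, vocab_size),
    }

def vae_forward(x, params, rng):
    # Encoder: map term vectors to the mean and log-variance of q(z|x).
    h_enc = np.tanh(x @ params["W_enc"])
    mu = h_enc @ params["W_mu"]
    logvar = h_enc @ params["W_logvar"]
    # Reparameterization: z = mu + sigma * eps with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps
    # Decoder: reconstruct per-term probabilities from the latent code.
    h_dec = np.tanh(z @ params["W_dec_h"])
    x_rec = sigmoid(h_dec @ params["W_dec_x"])
    return z, mu, logvar, x_rec

def negative_elbo(x, x_rec, mu, logvar):
    # Bernoulli reconstruction loss (assumed likelihood) per sentence.
    tiny = 1e-8
    recon = -np.sum(
        x * np.log(x_rec + tiny) + (1.0 - x) * np.log(1.0 - x_rec + tiny),
        axis=1,
    )
    # Closed-form KL divergence between q(z|x) and the prior N(0, I).
    kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1)
    return recon + kl

# Toy input: 4 sentences over a 20-term vocabulary (binary term vectors).
rng = np.random.default_rng(0)
x = (rng.random((4, 20)) < 0.2).astype(float)
params = init_params(vocab_size=20, hidden_size=8, latent_size=3, rng=rng)
z, mu, logvar, x_rec = vae_forward(x, params, rng)
loss = negative_elbo(x, x_rec, mu, logvar)
```

Here `z` holds one latent semantic vector per sentence and `loss` holds one negative lower bound per sentence. In a trainable version, the weights would be updated by back-propagating the mean of `loss`, which requires an automatic differentiation library rather than plain NumPy.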
