Using External Resources and Joint Learning for Bigram Weighting in ILP-Based Multi-Document Summarization

Some state-of-the-art summarization systems use integer linear programming (ILP) based methods that aim to maximize the important concepts covered in the summary. These concepts are often obtained by selecting bigrams from the documents. In this paper, we improve such bigram based ILP summarization methods from different aspects. First we use syntactic information to select more important bigrams. Second, to estimate the importance of the bigrams, in addition to the internal features based on the test documents (e.g., document frequency, bigram positions), we propose to extract features by leveraging multiple external resources (such as word embedding from additional corpus, Wikipedia, Dbpedia, WordNet, SentiWordNet). The bigram weights are then trained discriminatively in a joint learning model that predicts the bigram weights and selects the summary sentences in the ILP framework at the same time. We demonstrate that our system consistently outperforms the prior ILP method on different TAC data sets, and performs competitively compared to other previously reported best results. We also conducted various analyses to show the contributions of different components.

[1]  Samir Khuller,et al.  The Budgeted Maximum Coverage Problem , 1999, Inf. Process. Lett..

[2]  Xiaojun Wan,et al.  PKUTM Participation at TAC 2010 RTE and Summarization Track , 2010, TAC.

[3]  Ian H. Witten,et al.  Human-competitive tagging using automatic keyphrase extraction , 2009, EMNLP.

[4]  Laurent Romary,et al.  HUMB: Automatic Key Term Extraction from Scientific Articles in GROBID , 2010, *SEMEVAL.

[5]  Joshua Goodman,et al.  Finding advertising keywords on web pages , 2006, WWW '06.

[6]  Min-Yen Kan,et al.  Role of Verbs in Document Analysis , 1998, ACL.

[7]  Andrea Esuli,et al.  SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining , 2010, LREC.

[8]  Chin-Yew Lin,et al.  ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.

[9]  John M. Conroy,et al.  OCCAMS -- An Optimal Combinatorial Covering Algorithm for Multi-document Summarization , 2012, 2012 IEEE 12th International Conference on Data Mining Workshops.

[10]  Kai Hong,et al.  Improving the Estimation of Word Importance for News Multi-Document Summarization , 2014, EACL.

[11]  Hiroya Takamura,et al.  Text Summarization Model Based on Maximum Coverage Problem and its Variant , 2009, EACL.

[12]  K. Srinathan,et al.  An Effective Approach for AESOP and Guided Summarization Task , 2010, TAC.

[13]  Ryan T. McDonald A Study of Global Inference Algorithms in Multi-document Summarization , 2007, ECIR.

[14]  Chew Lim Tan,et al.  Exploiting Category-Specific Information for Multi-Document Summarization , 2012, COLING.

[15]  Noah A. Smith,et al.  Summarization with a Joint Model for Sentence Extraction and Compression , 2009, ILP 2009.

[16]  David W. Conrath,et al.  Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy , 1997, ROCLING/IJCLCLP.

[17]  Mirella Lapata,et al.  Multiple Aspect Summarization Using Integer Linear Programming , 2012, EMNLP.

[18]  Dan Klein,et al.  Learning Accurate, Compact, and Interpretable Tree Annotation , 2006, ACL.

[19]  Vincent Ng,et al.  Automatic Keyphrase Extraction: A Survey of the State of the Art , 2014, ACL.

[20]  Wei Gao,et al.  Utilizing Microblogs for Automatic News Highlights Extraction , 2014, COLING.

[21]  Dilek Z. Hakkani-Tür,et al.  The ICSI Summarization System at TAC 2008 , 2008, TAC.

[22]  Hui Lin,et al.  Multi-document Summarization via Budgeted Maximization of Submodular Functions , 2010, NAACL.

[23]  Dilek Z. Hakkani-Tür,et al.  The ICSI/UTD Summarization System at TAC 2009 , 2009, TAC.

[24]  Michael Collins,et al.  Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms , 2002, EMNLP.

[25]  Lin Zhao,et al.  Improving Multi-documents Summarization by Sentence Compression based on Expanded Constituent Parse Trees , 2014, EMNLP.

[26]  Yang Liu,et al.  Using Supervised Bigram-based ILP for Extractive Summarization , 2013, ACL.

[27]  Philip Resnik,et al.  Using Information Content to Evaluate Semantic Similarity in a Taxonomy , 1995, IJCAI.

[28]  Fei Liu,et al.  Document Summarization via Guided Sentence Compression , 2013, EMNLP.

[29]  Dekang Lin,et al.  An Information-Theoretic Definition of Similarity , 1998, ICML.

[30]  Martha Palmer,et al.  Verb Semantics and Lexical Selection , 1994, ACL.

[31]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[32]  John M. Conroy,et al.  An Assessment of the Accuracy of Automatic Evaluation in Summarization , 2012, EvalMetrics@NAACL-HLT.