Paper information - Language-independent multi-document text summarization with document-specific word associations

The goal of automatic text summarization is to generate an abstract of a document or a set of documents. In this paper, we propose a word-association-based method for generating summaries in a variety of languages. We show that a robust statistical method for finding associations that are specific to the given document(s) is applicable to many languages. We introduce strategies that use the discovered associations to select sentences from the document(s) to constitute the summary. Empirical results indicate that the method works reliably in a relatively large set of languages and outperforms methods reported in MultiLing 2013.

Hannu Toivonen | Antoine Doucet | Oskar Gross
