In this paper, we present SEMMR, a novel subtopic-enriched sentence ranking method for Chinese multi-document summarization, derived from Maximal Marginal Relevance (MMR). MMR is one of the most popular ranking algorithms for balancing topical relevance against content redundancy in a unified framework, and it has been widely used in text retrieval and document summarization. For the multi-document summarization task, existing MMR-based approaches usually incorporate the topical relevance between each sentence and the main topic directly into the sentence ranking process, while ignoring latent subtopic information of finer granularity. In practice, a document set on a main topic usually consists of a few implicit subtopics, and different subtopics may have unequal impact on sentence ranking. Specifically, sentences in close proximity to subtopics that are themselves close to the main topic are deemed more relevant than sentences related to subtopics far from the main topic. To address this issue and account for the impact of subtopics on sentence ranking performance, this paper extends the traditional MMR algorithm by integrating sub-topical relevance and sentence-to-subtopic proximity into the unified ranking process. Preliminary experimental results indicate the effectiveness of our proposed methods.
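The ranking idea described above can be illustrated with a minimal sketch. The first function is the standard greedy MMR selection (relevance traded off against the maximum similarity to already selected sentences). The second function, `subtopic_enriched_relevance`, is only a hypothetical illustration of how subtopic relevance weighted by subtopic-to-topic proximity might be blended into the relevance term; its name, the blending weight `alpha`, and the use of a maximum over subtopics are assumptions, not the formulation used by SEMMR.

```python
def mmr_rank(relevance, similarity, k, lam=0.7):
    """Greedy MMR selection over sentence indices.

    relevance[i]     : relevance of sentence i to the topic
    similarity[i][j] : similarity between sentences i and j
    lam              : trade-off between relevance and redundancy
    """
    selected = []
    candidates = set(range(len(relevance)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        # Iterating in sorted order makes ties resolve to the lowest index.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


def subtopic_enriched_relevance(sent_topic, sent_subtopic, subtopic_topic, alpha=0.5):
    """Hypothetical relevance blend (an assumption, not the paper's formula).

    sent_topic[i]       : relevance of sentence i to the main topic
    sent_subtopic[i][s] : proximity of sentence i to subtopic s
    subtopic_topic[s]   : proximity of subtopic s to the main topic
    """
    blended = []
    for i, rel in enumerate(sent_topic):
        # A sentence near a subtopic counts more when that subtopic
        # is itself close to the main topic.
        sub = max(p * w for p, w in zip(sent_subtopic[i], subtopic_topic))
        blended.append(alpha * rel + (1 - alpha) * sub)
    return blended
```

In this sketch, the output of `subtopic_enriched_relevance` would simply be passed as the `relevance` argument of `mmr_rank`, so the redundancy penalty is left unchanged and only the relevance term is enriched.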