Paper Information - Multi-document Summarization Using Minimum Distortion

Multi-document Summarization Using Minimum Distortion

Document summarization plays an important role in natural language processing and text mining. This paper proposes several novel information-theoretic models for multi-document summarization. The models treat document summarization as a transmission system and assume that the best summary is the one with minimum distortion. With a suitable distortion measure and a new representation method, the combination of the last two of these models (the linear representation model and the facility location model) achieves good experimental results on the DUC2002 and DUC2004 datasets. The paper also shows that the model has high interpretability and extensibility.
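The minimum-distortion idea in the abstract can be illustrated with a small sketch. The abstract does not give the paper's distortion measure or representation method, so everything below is an assumption made for illustration: sentences are represented as bag-of-words counts, distortion is taken to be one minus cosine similarity, and the facility-location objective is approximated with a simple greedy selection. The names `bow`, `distortion`, `total_distortion`, and `greedy_summary` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def bow(sentence):
    """Bag-of-words term counts for one sentence (lowercased, whitespace split)."""
    return Counter(sentence.lower().split())


def distortion(a, b):
    """Assumed distortion measure: 1 - cosine similarity of two count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 if norm == 0 else 1.0 - dot / norm


def total_distortion(vectors, chosen):
    """Facility-location style cost: every sentence is charged the distortion
    to its closest chosen (summary) sentence."""
    return sum(min(distortion(v, vectors[j]) for j in chosen) for v in vectors)


def greedy_summary(sentences, k):
    """Greedily pick up to k sentence indices, each time adding the sentence
    that yields the lowest total distortion. This is a heuristic, not an
    exact minimizer."""
    vectors = [bow(s) for s in sentences]
    chosen = []
    for _ in range(min(k, len(sentences))):
        candidates = [i for i in range(len(sentences)) if i not in chosen]
        best = min(candidates, key=lambda i: total_distortion(vectors, chosen + [i]))
        chosen.append(best)
    return chosen
```

Calling `greedy_summary(sentences, k)` on a list of sentence strings returns the indices of the selected sentences. Under these assumptions, the selected sentences act as "facilities" that each remaining sentence is assigned to, so a summary scores well when every source sentence has a close representative.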

Xiaojun Wan | Tengfei Ma

[1] Kamesh Munagala, et al. Local search heuristic for k-median and facility location problems, 2001, STOC '01.

[2] Hans Peter Luhn, et al. The Automatic Creation of Literature Abstracts, 1958, IBM J. Res. Dev.

[3] Naftali Tishby, et al. The Information Bottleneck Revisited or How to Choose a Good Distortion Measure, 2007, 2007 IEEE International Symposium on Information Theory.

[4] Dipanjan Das, et al. A Survey on Automatic Text Summarization, 2007.

[5] Rada Mihalcea, et al. A Language Independent Algorithm for Single and Multiple Document Summarization, 2005, IJCNLP.

[6] Jade Goldstein-Stewart, et al. The use of MMR, diversity-based reranking for reordering documents and producing summaries, 1998, SIGIR '98.

[7] Hua Li, et al. Document Summarization Using Conditional Random Fields, 2007, IJCAI.

[8] Hongyuan Zha, et al. Generic summarization and keyphrase extraction using mutual reinforcement principle and sentence clustering, 2002, SIGIR '02.

[9] ChengXiang Zhai, et al. A study of smoothing methods for language models applied to information retrieval, 2004, TOIS.

[10] Chong Long, et al. Multi-document Summarization by Information Distance, 2009, 2009 Ninth IEEE International Conference on Data Mining.

[11] Kam-Fai Wong, et al. Extractive Summarization Using Supervised and Semi-Supervised Learning, 2008, COLING.