Calculating the Upper Bounds for Multi-Document Summarization using Genetic Algorithms

In recent years, several Multi-Document Summarization (MDS) methods have been presented at the Document Understanding Conference (DUC) workshops. Since DUC01, approximately 268 state-of-the-art publications have presented methods that have allowed the continuous improvement of MDS; however, in most of these works the upper bounds were unknown. Recently, some works have focused on calculating the best sentence combinations for a set of documents, and in previous work we calculated the significance for the single-document summarization task on the DUC01 and DUC02 datasets. For the MDS task, however, no analysis of significance has been performed to rank the best multi-document summarization methods. In this paper, we propose a method based on Genetic Algorithms for calculating the best sentence combinations of the DUC01 and DUC02 datasets in MDS through a meta-document representation. Moreover, we have calculated three heuristics mentioned in several state-of-the-art works in order to rank the most recent MDS methods through the calculation of upper and lower bounds.

Yulia Ledeneva | René Arnulfo García-Hernández | Jonathan Rojas Simón
