Using Genetic Algorithms with Lexical Chains for Automatic Text Summarization

Automatic text summarization takes an input text and extracts its most important content. How important a piece of information is depends on several factors. In this paper, we combine two approaches that have been used in text summarization. The first uses genetic algorithms to learn the patterns in documents that lead to their summaries. The second uses lexical chains to represent the lexical cohesion that exists throughout the text. We propose a novel approach that incorporates lexical chains into the model as a feature and learns the feature weights using genetic algorithms. Experiments on the CAST corpus showed that combining different classes of features and including lexical chains outperforms the classical approaches.
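
To make the idea of learning feature weights with a genetic algorithm concrete, the following is a minimal sketch and not the implementation used in the paper. Everything in it is an assumption made for illustration: the three feature columns (position, term frequency, lexical chain score), the toy feature values, the reference extract, the helper names `summarize`, `fitness`, and `evolve`, and the specific GA settings (truncation selection, one-point crossover, Gaussian mutation). Each individual is a weight vector, a sentence is scored by the weighted sum of its features, and fitness is the overlap between the top-scoring sentences and a reference extract.

```python
import random

# Hypothetical per-sentence feature values (illustrative only):
# columns are (position score, term-frequency score, lexical-chain score).
FEATURES = [
    [1.0, 0.2, 0.9],
    [0.8, 0.1, 0.1],
    [0.5, 0.9, 0.8],
    [0.3, 0.3, 0.2],
    [0.1, 0.4, 0.7],
]
REFERENCE = {0, 2, 4}  # made-up indices of sentences in a human extract
SUMMARY_SIZE = 3


def summarize(weights, features=FEATURES, k=SUMMARY_SIZE):
    """Return the indices of the k sentences with the highest weighted score."""
    scores = [sum(w * f for w, f in zip(weights, row)) for row in features]
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def fitness(weights):
    """Fraction of the reference extract recovered by the weighted ranking."""
    return len(summarize(weights) & REFERENCE) / SUMMARY_SIZE


def evolve(pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    """Evolve a weight vector; the fitter half of each generation survives."""
    rng = random.Random(seed)
    n = len(FEATURES[0])
    population = [[rng.random() for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_rate:
                i = rng.randrange(n)  # Gaussian mutation, clamped to [0, 1]
                child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print("weights:", [round(w, 2) for w in best])
    print("selected sentences:", sorted(summarize(best)))
    print("fitness:", fitness(best))
```

In this toy data the lexical-chain column happens to separate the reference sentences from the rest, so weight vectors that favor it score well. A real system would compute the features from documents and measure fitness against a corpus of human summaries.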
