Fuzzy Genetic Semantic Based Text Summarization

Automatic text summarization is a data reduction process that excludes unnecessary details and presents the important information in a shorter version. One way to summarize a document is to extract its important sentences. To select suitable sentences, a numerical rank is assigned to each sentence using a sentence scoring approach, and the highest-ranked sentences are used for the summary. This paper proposes an automatic text summarization approach based on sentence extraction that uses fuzzy logic, a genetic algorithm, semantic role labeling, and their combinations to generate high-quality summaries. The study explores the benefits of the genetic algorithm as an optimizer: it is used for feature selection during the training phase and to adjust feature weights during the testing phase. Fuzzy IF-THEN rules are used to balance the weights between important and unimportant features. Conventional extraction methods cannot capture semantic relations between concepts in a text; this research therefore investigates the use of semantic role labeling to capture the semantic content of sentences and incorporates it into the summarization method. The proposed approaches are evaluated using the ROUGE toolkit. Experimental results show that the summaries they produce are better than those produced by the Microsoft Word 2007, Copernic Summarizer, and MANYASPECTS summarizers.
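The sentence-extraction idea described above (score each sentence, keep the highest-ranked ones) can be illustrated with a minimal sketch. This is not the paper's method: the three features (position, relative length, title overlap), the function names, and the default weights are all hypothetical choices made for illustration. The `weights` argument only marks where weights tuned by a genetic algorithm could be plugged in; the genetic algorithm, the fuzzy rules, and semantic role labeling are not implemented here.

```python
import re

def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    # Lowercased alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", sentence.lower())

def sentence_features(sentences, title):
    # Hypothetical feature set (an assumption, not taken from the paper):
    # position in the document, relative length, overlap with the title.
    title_words = set(tokenize(title))
    longest = max(1, max(len(tokenize(s)) for s in sentences))
    n = len(sentences)
    rows = []
    for i, s in enumerate(sentences):
        words = tokenize(s)
        position = (n - i) / n            # earlier sentences score higher
        length = len(words) / longest     # normalized by the longest sentence
        overlap = (len(title_words & set(words)) / len(title_words)
                   if title_words else 0.0)
        rows.append((position, length, overlap))
    return rows

def summarize(text, title, weights=(1.0, 1.0, 1.0), k=2):
    # Score = weighted sum of features; keep the k best sentences and
    # return them in their original document order.
    sentences = split_sentences(text)
    if not sentences:
        return []
    scores = [sum(w * f for w, f in zip(weights, row))
              for row in sentence_features(sentences, title)]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]

if __name__ == "__main__":
    doc = ("Fuzzy logic scores sentences. The weather was nice. "
           "Genetic algorithms tune feature weights for summarization.")
    for line in summarize(doc, "Fuzzy genetic summarization", k=2):
        print(line)
```

Changing `weights` changes which sentences are selected, which is the quantity an optimizer such as a genetic algorithm would search over against a quality measure like ROUGE.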
