Mr&mr-Sum: Maximum Relevance and Minimum Redundancy Document Summarization Model

We present an approach to automatic document summarization in which text summarization is modeled as a quadratic integer-programming problem. The model optimizes three properties: (1) relevance: the summary should contain informative textual units that are relevant to the user; (2) redundancy: the summary should not contain multiple textual units that convey the same information; and (3) length: the summary is bounded in length. To solve the optimization problem, we developed a novel differential evolution algorithm. Experimental results on the DUC2005 and DUC2007 data sets show that the proposed approach outperforms the other methods it was compared with.
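To make the three properties concrete, the following is a minimal sketch of a relevance-minus-redundancy objective under a length budget. It is an illustration only, not the formulation from the paper: the binary selection vector `x`, the trade-off weight `lam`, the toy scores, and the function names `objective` and `best_summary` are all assumptions, and exhaustive search stands in for the differential evolution algorithm, which is not reproduced here.

```python
from itertools import product


def objective(x, relevance, similarity, lam=1.0):
    """Quadratic 0-1 objective (hypothetical form): total relevance of the
    selected units minus lam times the pairwise similarity among them."""
    n = len(x)
    rel = sum(relevance[i] * x[i] for i in range(n))
    red = sum(similarity[i][j] * x[i] * x[j]
              for i in range(n) for j in range(i + 1, n))
    return rel - lam * red


def best_summary(relevance, similarity, lengths, max_len, lam=1.0):
    """Exhaustively search all 0/1 selections that respect the length bound.
    Only practical for a handful of units; a stand-in for a real solver."""
    best_x, best_val = None, float("-inf")
    for x in product((0, 1), repeat=len(relevance)):
        # Length constraint: total length of selected units must fit the budget.
        if sum(l * xi for l, xi in zip(lengths, x)) > max_len:
            continue
        val = objective(x, relevance, similarity, lam)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Toy data (made up): units 0 and 1 are both relevant but nearly identical.
relevance = [0.9, 0.8, 0.5]
similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
lengths = [10, 10, 10]

selection, score = best_summary(relevance, similarity, lengths, max_len=20)
print(selection)  # (1, 0, 1): unit 1 is skipped because it duplicates unit 0
```

In this toy instance the two most relevant units are not chosen together, because their high mutual similarity costs more than the relevance the second one adds; the less relevant but distinct third unit is selected instead.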
