A genetic graph-based clustering approach to biomedical summarization

Summarization techniques have become increasingly important over the last few years, especially in biomedical research, where information overload is a major problem. Researchers in this area need a shorter version of each text that contains all the important information and discards the irrelevant content. Several applications address this problem; however, their summaries are sometimes less informative than the user needs. This work addresses that shortcoming by improving a graph-based summarization process with genetic clustering techniques. Our automatic summaries are compared to those produced by several commercial and research summarizers, and the comparison demonstrates the appropriateness of using genetic techniques in automatic summarization.
