Generating Coherent Summaries of Scientific Articles Using Coherence Patterns

Previous work on automatic summarization does not thoroughly consider coherence when generating summaries. We introduce a graph-based approach to summarizing scientific articles that employs coherence patterns to ensure the generated summaries are coherent. The novelty of our model is twofold: we mine coherence patterns from a corpus of abstracts, and we propose a method that combines coherence, importance, and non-redundancy to generate the summary. We optimize these three factors simultaneously using Mixed Integer Programming. Our approach significantly outperforms baseline and state-of-the-art systems in terms of both coherence (summary coherence assessment) and relevance (ROUGE scores).
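To make the idea of optimizing importance, non-redundancy, and coherence together more concrete, the following is a minimal sketch on toy data. It is not the model described above: the abstract does not give the objective or constraints, and instead of a Mixed Integer Programming solver the sketch simply enumerates every sentence subset, which is only feasible for a handful of sentences. The function name `select_summary`, the weights, the length budget, and all scores are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def select_summary(importance, lengths, redundancy, coherence, budget,
                   weights=(1.0, 1.0, 1.0)):
    """Pick the sentence subset with the best combined score.

    Assumed (hypothetical) objective:
        w_imp * importance + w_coh * coherence - w_red * redundancy
    subject to a total length budget. Exhaustive search stands in for
    the MIP solver mentioned in the abstract.
    """
    w_imp, w_coh, w_red = weights
    n = len(importance)
    best_score, best_subset = float("-inf"), ()
    for k in range(1, n + 1):
        # combinations() yields indices in document order
        for subset in combinations(range(n), k):
            if sum(lengths[i] for i in subset) > budget:
                continue  # violates the length constraint
            imp = sum(importance[i] for i in subset)
            # redundancy is penalized for every selected pair
            red = sum(redundancy[i][j] for i, j in combinations(subset, 2))
            # coherence is rewarded between consecutive selected sentences
            coh = sum(coherence[i][j] for i, j in zip(subset, subset[1:]))
            score = w_imp * imp + w_coh * coh - w_red * red
            if score > best_score:
                best_score, best_subset = score, subset
    return list(best_subset), best_score


# Toy inputs for four sentences (all values are made up).
importance = [0.9, 0.8, 0.4, 0.7]
lengths = [20, 25, 15, 20]          # e.g. tokens per sentence
redundancy = [                      # symmetric pairwise overlap
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.1],
    [0.2, 0.1, 0.1, 0.0],
]
coherence = [                       # reward for sentence j following sentence i
    [0.0, 0.2, 0.6, 0.1],
    [0.0, 0.0, 0.3, 0.5],
    [0.0, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.0],
]

chosen, score = select_summary(importance, lengths, redundancy, coherence,
                               budget=60)
print(chosen, round(score, 2))
```

On this toy input the two most important sentences (0 and 1) are not selected together, because their high mutual redundancy outweighs their individual importance. A joint objective can therefore choose a different summary than ranking sentences by importance alone.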
