Paper Information - A Corpus of Very Short Scientific Summaries

We present a new summarisation task: taking scientific articles and producing journal table-of-contents entries in the chemistry domain. These entries are one- or two-sentence author-written summaries that present the key findings of a paper. This is a first look at the task, using an open-access publication corpus in which titles and abstracts serve as the input texts and short author-written advertising blurbs serve as the ground truth. We introduce the dataset and evaluate state-of-the-art summarisation methods on it.
