Overview and Insights from the Shared Tasks at Scholarly Document Processing 2020: CL-SciSumm, LaySumm and LongSumm

We present the results of three Shared Tasks held at the Scholarly Document Processing Workshop at EMNLP 2020: CL-SciSumm, LaySumm and LongSumm. We report on each of the tasks, which together received 18 submissions, some of which addressed two or all three tasks. The quality and quantity of the submissions show that there is ample interest in scholarly document summarization, and that the state of the art in this domain sits midway between an impossible task and a fully resolved one.
