Overview of the CL-SciSumm 2016 Shared Task

The CL-SciSumm 2016 Shared Task is the first medium-scale shared task on scientific document summarization in the computational linguistics (CL) domain. The task built on the experience and training data set created in its namesake pilot task, which was conducted in 2014 by the same organizing committee. The track included three tasks: (1A) identifying relationships between citing documents and the referred document, (1B) classifying the discourse facets, and (2) generating the abstractive summary. The dataset comprised 30 annotated sets of citing and reference papers drawn from open access research papers in the CL domain. This overview paper describes the participation and the official results of the second CL-SciSumm Shared Task, organized as a part of the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2016), held in New Jersey, USA in June 2016. The annotated dataset used for this shared task and the scripts used for evaluation can be accessed and used by the community at: https://github.com/WING-NUS/scisumm-corpus.
