Automatic Pyramid Evaluation Exploiting EDU-based Extractive Reference Summaries

This paper addresses the automation of the pyramid method, a reliable manual framework for evaluating summaries. To construct a pyramid, we transform human-written reference summaries into extractive reference summaries composed of Elementary Discourse Units (EDUs) taken from the source documents, and then weight each EDU by the number of extractive reference summaries that contain it. A summary is scored according to the correspondences between its EDUs and those in the pyramid. Experiments on the DUC and TAC data sets show that our methods correlate strongly with various manual evaluations.
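
The weighting and scoring steps described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that EDUs have already been segmented and are represented by hypothetical string identifiers, that a summary EDU corresponds to a pyramid EDU only when the identifiers are identical (the abstract does not specify how correspondences are found), and that the score is normalized by the best weight attainable with the same number of EDUs, as in the original manual pyramid method. The function names `build_pyramid` and `pyramid_score` are illustrative.

```python
from collections import Counter
from typing import Hashable, Iterable


def build_pyramid(extractive_refs: Iterable[Iterable[Hashable]]) -> Counter:
    """Weight each EDU by the number of extractive reference summaries containing it."""
    weights: Counter = Counter()
    for ref in extractive_refs:
        # set() ensures an EDU is counted at most once per reference summary.
        weights.update(set(ref))
    return weights


def pyramid_score(summary_edus: Iterable[Hashable], pyramid: Counter) -> float:
    """Score a summary by the weights of its EDUs that appear in the pyramid.

    Assumption: exact identifier match stands in for EDU correspondence, and
    the result is normalized by the maximum weight reachable with the same
    number of EDUs (the normalization of the manual pyramid method).
    """
    edus = set(summary_edus)
    observed = sum(pyramid[e] for e in edus if e in pyramid)
    # Best case: the summary's EDU budget is spent on the heaviest pyramid EDUs.
    best = sum(sorted(pyramid.values(), reverse=True)[: len(edus)])
    return observed / best if best else 0.0


if __name__ == "__main__":
    # Three hypothetical extractive reference summaries over EDU identifiers.
    refs = [{"e1", "e2", "e3"}, {"e1", "e2"}, {"e1", "e4"}]
    pyramid = build_pyramid(refs)
    print(pyramid_score({"e1", "e4", "e9"}, pyramid))
```

In this toy case `e1` appears in all three references and receives weight 3, so a summary that includes it is rewarded more than one that includes only `e3` or `e4`, each of which has weight 1.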
