Towards Evaluating the Impact of Anaphora Resolution on Text Summarisation from a Human Perspective

Automatic Text Summarisation (TS) is the process of abstracting key content from information sources. Previous research has attempted to combine diverse NLP techniques to improve the quality of the summaries produced. The study reported in this paper seeks to establish whether Anaphora Resolution (AR) can improve the quality of generated summaries, and whether AR has the same impact on texts from different subject domains. Summarisation evaluation is critical to the development of automatic summarisation systems, and previous studies have typically evaluated their summaries using automatic techniques. However, automatic techniques cannot assess certain factors that are better judged by human beings. In this paper, the summaries are therefore evaluated by human judgment against four criteria: informativeness, readability and understandability, conciseness, and the overall quality of the summary. Overall, the results show a pattern of slight but statistically non-significant improvements in the quality of summaries produced using AR. At the level of individual subject domains, however, the results demonstrate that the contribution of AR to TS is domain dependent: for some domains it has a statistically significant impact.
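
To make the pipeline under study concrete, the following is a minimal sketch of AR-assisted extractive summarisation. It is not the implementation evaluated in the paper: resolve_anaphora is a hypothetical stand-in for an external coreference resolver, and the ranking step is a simplified TextRank (word-overlap sentence similarity scored with power-iteration PageRank).

    import math
    import re

    def resolve_anaphora(text):
        # Hypothetical stand-in for a coreference resolver. A real
        # implementation would replace anaphoric pronouns with their
        # antecedents, e.g. 'Mary arrived. She sat.' -> 'Mary arrived. Mary sat.'
        return text  # identity here; an external AR tool is assumed

    def sentence_similarity(s1, s2):
        # Word overlap normalised by sentence lengths, as in TextRank.
        w1, w2 = set(s1.lower().split()), set(s2.lower().split())
        if len(w1) < 2 or len(w2) < 2:
            return 0.0
        return len(w1 & w2) / (math.log(len(w1)) + math.log(len(w2)))

    def textrank(sentences, damping=0.85, iters=50):
        # Rank sentences by running PageRank (power iteration) over
        # the weighted sentence-similarity graph.
        n = len(sentences)
        sim = [[sentence_similarity(a, b) for b in sentences] for a in sentences]
        scores = [1.0 / n] * n
        for _ in range(iters):
            new_scores = []
            for i in range(n):
                rank = 0.0
                for j in range(n):
                    if i == j or sim[j][i] == 0.0:
                        continue
                    out_weight = sum(sim[j][k] for k in range(n) if k != j)
                    rank += sim[j][i] * scores[j] / out_weight
                new_scores.append((1 - damping) / n + damping * rank)
            scores = new_scores
        return scores

    def summarise(text, k=2, use_ar=True):
        # AR-then-summarise pipeline: optionally resolve anaphora first,
        # then extract the k top-ranked sentences in document order.
        if use_ar:
            text = resolve_anaphora(text)
        sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
        ranks = textrank(sentences)
        top = sorted(range(len(sentences)), key=lambda i: ranks[i], reverse=True)[:k]
        return ' '.join(sentences[i] for i in sorted(top))

The domain-level finding (significance in some domains only) implies a per-domain statistical test over the human ratings. The paper's exact test is not stated here; assuming paired Likert-scale ratings of the same documents summarised with and without AR, a Wilcoxon signed-rank test is one plausible choice:

    from scipy.stats import wilcoxon

    # Hypothetical paired human ratings (1-5 scale) for one subject domain.
    ratings_with_ar    = [4, 4, 5, 3, 4, 5, 4, 3, 4, 5]
    ratings_without_ar = [3, 4, 4, 3, 3, 4, 4, 3, 3, 4]
    stat, p = wilcoxon(ratings_with_ar, ratings_without_ar)
    print(f"W = {stat:.1f}, p = {p:.3f}")  # p < 0.05 would flag this domain as significant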
