Automatic generation of entity-oriented summaries for reputation management

Producing online reputation summaries for an entity (company, brand, etc.) is a focused summarization task with a distinctive feature: issues that may affect the reputation of the entity take priority in the summary. In this paper we (i) present a new test collection of manually created (abstractive and extractive) reputation reports that summarize tweet streams for 31 companies in the banking and automobile domains; (ii) propose a novel methodology for evaluating summaries in the context of online reputation monitoring, which draws on an analogy between reputation reports and the problem of diversity in search; and (iii) provide empirical evidence that producing reputation reports differs from a standard summarization problem, and that incorporating priority signals is essential to address the task effectively.
