Social Web Observatory: A Platform and Method for Gathering Knowledge on Entities from Different Textual Sources

In this work we describe a framework for the entity-driven collection and summarization of information from the Web. The framework consists of a set of workflows and the Social Web Observatory platform, which implements those workflows and supports them through a language analysis pipeline. The pipeline includes text collection and crawling, entity identification, clustering of texts into events related to entities, and entity-centric sentiment analysis, as well as text analytics and visualization functionalities. The latter allow the user to take advantage of the gathered information as actionable knowledge: to understand the dynamics of public opinion on a given entity over time and across real-world events. We describe the platform and its analysis functionality, and we evaluate the system by having human users score how well it fulfils its intended purpose of summarizing entity-centered information from different sources on the Web.
