Towards Monitoring of Novel Statements in the News

In media monitoring, users have a clearly defined information need: finding previously unknown statements about certain entities or relations mentioned in natural-language text. However, commonly used keyword-based search technologies focus on finding relevant documents and cannot judge the novelty of the statements contained in the text. In this work, we propose a new semantic novelty measure that makes it possible to retrieve statements that are both novel and relevant from natural-language sentences in news articles. Relevance is defined by a semantic query of the user, while novelty is ensured by checking whether the extracted statements are related to, but not yet present in, a knowledge base containing the currently known facts. Our evaluation, performed on English news texts with CrunchBase as the knowledge base, demonstrates the effectiveness and unique capabilities of this novel approach to novelty and highlights future challenges.
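The idea of combining relevance and novelty can be illustrated with a minimal sketch. It is not the measure proposed in the paper: it assumes statements have already been extracted as (subject, predicate, object) triples, treats the semantic query as a triple pattern with wildcards, and reduces "related" to "shares an entity with the knowledge base". All names (`Triple`, `Query`, `novel_relevant`) and the example entities are hypothetical.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """A statement extracted from a sentence (assumed representation)."""
    subject: str
    predicate: str
    obj: str


class Query(NamedTuple):
    """A triple pattern; None acts as a wildcard (assumed query form)."""
    subject: Optional[str] = None
    predicate: Optional[str] = None
    obj: Optional[str] = None


def is_relevant(t: Triple, q: Query) -> bool:
    # A statement is relevant if every non-wildcard query slot matches.
    return all(want is None or want == got for want, got in zip(q, t))


def is_related(t: Triple, kb: set) -> bool:
    # Simplification: related means the statement mentions a known entity.
    entities = {e for k in kb for e in (k.subject, k.obj)}
    return t.subject in entities or t.obj in entities


def is_novel(t: Triple, kb: set) -> bool:
    # Novel means related to the knowledge base but not yet contained in it.
    return is_related(t, kb) and t not in kb


def novel_relevant(extracted: list, q: Query, kb: set) -> list:
    # Keep only statements that are both relevant and novel.
    return [t for t in extracted if is_relevant(t, q) and is_novel(t, kb)]


# Hypothetical knowledge base and extracted statements.
kb = {Triple("AcmeCorp", "acquired", "WidgetCo")}
extracted = [
    Triple("AcmeCorp", "acquired", "WidgetCo"),   # already known
    Triple("AcmeCorp", "acquired", "GadgetInc"),  # related and new
    Triple("OtherCo", "acquired", "FooBar"),      # does not match the query
]
query = Query(subject="AcmeCorp", predicate="acquired")
result = novel_relevant(extracted, query, kb)
```

A real system would additionally need entity linking and relation mapping between the free-text statements and the knowledge base vocabulary; exact string matching is used here only to keep the sketch short.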