Storybase: Towards Building a Knowledge Base for News Events

To better organize and understand online news information, we propose Storybase, a knowledge base for news events that builds upon Wikipedia current events and daily Web news. It first constructs stories and their timelines based on Wikipedia current events and then detects and links daily news to enrich those Wikipedia stories with more comprehensive events. We encode events and develop efficient event clustering and chaining techniques in an event space. We demonstrate Storybase with a news events search engine that helps find historical and ongoing news stories and inspect their dynamic timelines.
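The abstract does not specify how events are encoded, clustered, or chained, so the following is only a minimal sketch of the general idea under stated assumptions: events are encoded as bag-of-words count vectors, assigned to stories by a greedy single-pass clustering on cosine similarity, and chained into a timeline by date. The function names (`encode`, `cosine`, `build_stories`), the threshold of 0.3, and the toy event texts are all hypothetical and are not taken from Storybase.

```python
import math
from collections import Counter
from datetime import date


def encode(text):
    """Encode an event description as a bag-of-words count vector (assumed encoding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_stories(events, threshold=0.3):
    """Greedy single-pass clustering of (date, text) events into stories.

    Events are processed in date order, so each story's event list is
    already a chronological timeline (a simple stand-in for chaining).
    """
    stories = []  # each story: {"centroid": Counter, "events": [(date, text), ...]}
    for day, text in sorted(events):
        vec = encode(text)
        best, best_sim = None, threshold
        for story in stories:
            sim = cosine(vec, story["centroid"])
            if sim >= best_sim:
                best, best_sim = story, sim
        if best is None:
            # No existing story is similar enough: start a new one.
            best = {"centroid": Counter(), "events": []}
            stories.append(best)
        best["centroid"].update(vec)  # accumulate term counts as the story centroid
        best["events"].append((day, text))
    return stories


# Toy, made-up events for illustration only.
events = [
    (date(2020, 1, 1), "storm hits coastal city"),
    (date(2020, 1, 2), "election results announced in capital"),
    (date(2020, 1, 3), "storm damage in coastal city assessed"),
    (date(2020, 1, 4), "election results disputed in capital"),
]

stories = build_stories(events)
for i, story in enumerate(stories):
    print(f"Story {i}:")
    for day, text in story["events"]:
        print(f"  {day.isoformat()}  {text}")
```

On this toy input the two storm events and the two election events end up in separate stories, each listed in date order. A production system would likely replace the linear scan over stories with an approximate nearest-neighbour index to keep clustering efficient on daily news volumes.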
