Supporting News Article Understanding by Detecting Subject-Background Event Relations

News articles typically mention not one but multiple events. These events can be classified as subject events or background events. The former are the events that the article is written about, while the latter are additional events mentioned to explain the background of the subject events (e.g., their causes, circumstances, or consequences). Background events are considered to play an important role in helping readers understand articles. In this paper, we first propose classifying the content of news articles into subject and background event descriptions. In the second part of the paper, we demonstrate a novel solution for improving news article search. Based on the subject-background relationship structure between events and articles, our method outputs news articles that help readers understand a given target article.
