Generating Breakpoint-based Timeline Overview for News Topic Retrospection

Although news readers can easily access a large number of news articles on the Internet, the sheer quantity of information can be overwhelming, making it hard to form a concise, global picture of a news topic. In this paper, we propose a novel method to address this problem. Given a set of articles on a news topic, the proposed method models how themes vary over time and identifies breakpoints, the time points at which decisive changes occur. For each breakpoint, a brief summary is automatically constructed from the articles associated with that time point. The summaries are then ordered chronologically to form a timeline overview of the news topic, so that readers can track news topics efficiently. We conducted experiments on 15 popular topics from 2010. The results show the effectiveness of our approach and its advantages over other approaches.
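
To make the pipeline described above concrete, the following is a minimal sketch and not the model proposed in the paper, whose details the abstract does not give. It assumes that theme variation can be approximated by the Jensen-Shannon divergence between the word distributions of consecutive dates, that a breakpoint is any date whose divergence from the previous date exceeds a fixed threshold, and that the first article of a breakpoint date can stand in for its summary. The names `word_distribution`, `js_divergence`, `find_breakpoints`, `timeline`, and the `threshold` parameter are all hypothetical.

```python
from collections import Counter
from math import log2


def word_distribution(docs):
    """Unigram distribution over all words in the given articles.

    Assumes docs contains at least one non-empty article.
    """
    counts = Counter(w for d in docs for w in d.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2), ranging from 0 to 1."""
    vocab = set(p) | set(q)
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def kl(a):
        # Every word with a[w] > 0 also has m[w] > 0, so the ratio is safe.
        return sum(a[w] * log2(a[w] / m[w]) for w in a if a[w] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def find_breakpoints(articles_by_date, threshold=0.5):
    """Dates whose word distribution diverges sharply from the previous date.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    dates = sorted(articles_by_date)
    dists = [word_distribution(articles_by_date[d]) for d in dates]
    return [
        dates[i]
        for i in range(1, len(dates))
        if js_divergence(dists[i - 1], dists[i]) > threshold
    ]


def timeline(articles_by_date, threshold=0.5):
    """Chronological (date, summary) pairs, one per breakpoint.

    The first article of each breakpoint date is a placeholder summary.
    """
    return [
        (d, articles_by_date[d][0])
        for d in find_breakpoints(articles_by_date, threshold)
    ]


sample = {
    "2010-01-01": ["oil spill gulf", "oil spill rig"],
    "2010-01-02": ["oil spill gulf rig"],
    "2010-01-03": ["court ruling fine", "court fine company"],
}
print(timeline(sample))
```

In this toy input the vocabulary changes completely on the third date, so only that date is reported as a breakpoint. A real system would replace the placeholder summary with a proper multi-document summarizer and would likely smooth the distributions over a window of dates.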
