T-Scroll: Visualizing Trends in a Time-Series of Documents for Interactive User Exploration

On the Internet, a large number of documents, such as news articles and online journals, are delivered every day. We often need to review the major topics and topic transitions in a large time-series of documents, but browsing and analyzing the target documents takes considerable time and effort. We have therefore developed an information visualization system called T-Scroll (Trend/Topic-Scroll) to visualize the transitions of topics extracted from those documents. The system takes the periodic outputs of an underlying clustering system for a time-series of documents and visualizes the relationships between clusters as a scroll. Using its interaction facilities, users can grasp the topic transitions and the details of topics for the target time period. This paper describes the idea, functions, implementation, and evaluation of the T-Scroll system.
