Topic Detection and Tracking for News Web Pages

This paper proposes a new approach to observing, summarizing, and tracking events in a collection of news Web pages. Given a set of temporal Web pages, we extract a valid timestamp from each page and detect events by clustering the pages. We then track events by applying KeyGraph to the resulting clusters, and summarize each cluster by using SuffixTree. We report experimental results that show the usefulness of our approach.
