Introduction to topic detection and tracking

The Topic Detection and Tracking (TDT) research program has been running for five years, starting with a pilot study and continuing with yearly open, competitive evaluations since then. In this chapter we define the basic concepts of TDT and place them in historical context. In describing the various TDT evaluation tasks and workshops, we also give an overview of the technical approaches that have been used and of those that have succeeded.
