Topic Detection and Tracking Pilot Study Final Report

Topic Detection and Tracking (TDT) is a DARPA-sponsored initiative to investigate the state of the art in finding and following new events in a stream of broadcast news stories. The TDT problem consists of three major tasks: (1) segmenting a stream of data, especially recognized speech, into distinct stories; (2) identifying those news stories that are the first to discuss a new event occurring in the news; and (3) given a small number of sample news stories about an event, finding all subsequent stories in the stream that discuss that event.
