Learning approaches for detecting and tracking news events

The authors extend existing supervised-learning and unsupervised-clustering algorithms so that documents can be classified by both the information content and the temporal aspects of news events. They adapt several IR and machine-learning techniques for effective event detection and tracking. The article discusses their research using manually segmented documents.
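To make the idea of combining information content with temporal aspects concrete, the sketch below shows one simple way such a detector could work. It is an illustration under stated assumptions, not the authors' algorithm: it assumes single-pass clustering, raw term-frequency vectors, cosine similarity, and a linear time decay. The names `detect_events`, `threshold`, and `window` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_events(stories, threshold=0.3, window=3):
    """Assign an event label to each story in one pass.

    stories: list of (day, text) pairs in chronological order
             (an assumption of this sketch).
    threshold, window: hypothetical tuning parameters.
    """
    events = []  # each event keeps a term centroid and the day it was last seen
    labels = []
    for day, text in stories:
        vec = Counter(text.lower().split())
        best, best_score = None, 0.0
        for idx, event in enumerate(events):
            # Temporal aspect: similarity fades linearly as the event goes quiet.
            age = day - event["last_day"]
            decay = max(0.0, 1.0 - age / window)
            score = cosine(vec, event["centroid"]) * decay
            if score > best_score:
                best, best_score = idx, score
        if best is not None and best_score >= threshold:
            # Tracking: the story joins an existing event.
            events[best]["centroid"].update(vec)
            events[best]["last_day"] = day
            labels.append(best)
        else:
            # Detection: the story starts a new event.
            events.append({"centroid": vec, "last_day": day})
            labels.append(len(events) - 1)
    return labels
```

In this sketch, a story with the same wording as an earlier one is still treated as a new event once the earlier event has been silent for longer than `window` days, which is how the temporal aspect enters the decision.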
