Information Filtering in TREC-9 and TDT-3: A Comparative Analysis

Much work on automated information filtering has been done in the TREC and TDT domains, but differences in corpora, the nature of TREC topics vs. TDT events, the constraints imposed on training and testing, and the choice of performance measures confound any meaningful comparison between these domains. We attempt to bridge this gap by evaluating the performance of the k-nearest-neighbor (kNN) classification system on the corpus and categories from one domain using the constraints of the other. To maximize comparability and to understand the effect of the evaluation metrics specific to each domain, we optimize the performance of kNN separately for the F1, T9P (the preferred metric for TREC-9) and Ctrk (the official metric for TDT-3) metrics. Through a thorough comparison of our within-domain and cross-domain results, we demonstrate that the corpus used for TREC-9 is more challenging for an information filtering system than the TDT-3 corpus, and we find strong evidence that the TDT-3 event tracking task itself is more difficult than the TREC batch filtering task. We also show that optimizing performance in TREC-9 and in TDT-3 tends to produce systems with different performance characteristics, which further complicates direct comparison between the two domains, and that T9P and Ctrk both have properties that make them undesirable as general information filtering metrics.
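
The three metrics named above can be sketched from a per-topic contingency table. This is only an illustrative sketch: the abstract does not give the formulas or constants, so the T9P retrieval target of 50 documents and the Ctrk parameters (miss cost 1.0, false-alarm cost 0.1, target prior 0.02) are assumptions based on the commonly cited TREC-9 and TDT-3 evaluation settings, and the function and parameter names are hypothetical.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (higher is better)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def t9p(tp: int, fp: int, min_retrieved: int = 50) -> float:
    """Precision with a floor on the number retrieved (higher is better).

    min_retrieved=50 is an assumed target, not stated in the abstract.
    """
    return tp / max(min_retrieved, tp + fp)


def ctrk(tp: int, fp: int, fn: int, tn: int,
         c_miss: float = 1.0, c_fa: float = 0.1,
         p_target: float = 0.02) -> float:
    """Weighted miss / false-alarm cost (lower is better).

    The default costs and prior are assumed values, not from the abstract.
    """
    p_miss = fn / (tp + fn) if (tp + fn) else 0.0
    p_fa = fp / (fp + tn) if (fp + tn) else 0.0
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


if __name__ == "__main__":
    # A system that accepts few documents: 10 hits, 5 false alarms, 10 misses.
    print(f1(10, 5, 10), t9p(10, 5), ctrk(10, 5, 10, 975))
```

Under these assumed definitions, F1 and T9P reward what is retrieved, while Ctrk penalizes misses and false alarms at fixed costs, so a decision threshold tuned for one metric need not be a good threshold for another.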
