MUC/MET Evaluation Trends

During the course of the Tipster Program, evaluation methodology for information extraction developed as the technology progressed. Multiple task levels and multiple languages were successful targets of information extraction. Automated scoring and statistical significance algorithms were developed for use in scoring systems and for interannotator agreement measures. The scoring interface allowed both system developers and annotators to analyze errors and improve their work. This software and the marked datasets are now in the public domain. Future projects are being carried out based on simplifications indicated by the data, downstream applications, and tractability of scoring algorithms.