Session 3: Human Language Evaluation

* Cross-system evaluation: This is a mainstay of the periodic ARPA evaluations of competing systems. Multiple sites agree to run their respective systems on a single application, so that results across systems are directly comparable. Such evaluations include message understanding (MUC) [6], information retrieval (TREC) [7], spoken language systems (ATIS) [8], and continuous speech recognition (CSR) [8]. A sketch of the shared-scoring idea follows this list.
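The common thread across these evaluations is a shared test set and a shared scoring metric, so numbers reported by different sites measure the same thing. The following is a minimal sketch of such a scoring harness in Python, assuming a CSR-style setup where the metric is word error rate; the system labels, sample outputs, and helper names (`word_error_rate`, `score_systems`) are hypothetical illustrations, not part of any official scoring software.

```python
# Sketch of a shared scoring harness: every site's hypotheses are scored
# against the same reference transcripts with the same metric, so the
# resulting numbers are comparable across systems.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate via Levenshtein edit distance over tokens."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)


def score_systems(references, system_outputs):
    """Average per-sentence WER for each system over a common test set.

    (A simplification: official scoring pools errors over all reference
    words rather than averaging per-sentence rates.)
    """
    return {
        name: sum(word_error_rate(r, h)
                  for r, h in zip(references, hyps)) / len(references)
        for name, hyps in system_outputs.items()
    }


if __name__ == "__main__":
    refs = ["the cat sat on the mat", "open the window"]
    outputs = {  # hypothetical outputs from two participating sites
        "site_a": ["the cat sat on a mat", "open the window"],
        "site_b": ["a cat sat on the mat please", "open window"],
    }
    for name, wer in score_systems(refs, outputs).items():
        print(f"{name}: WER = {wer:.3f}")
```

Because every system is scored by the same procedure on the same data, a difference in the reported numbers reflects a difference between the systems rather than a difference in test conditions.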