DARPA February 1992 Pilot Corpus CSR "Dry Run" Benchmark Test Results

Over the past several years, continuous speech recognition research within the DARPA Spoken Language community has focused on the Resource Management (RM) and Air Travel Information System (ATIS) corpora. Within the past year, plans have been developed for a large, multi-component "general-purpose English, large vocabulary, natural language, high perplexity corpus" known as the DARPA [Wall Street Journal-based] Continuous Speech Recognition (CSR) Corpus [1]. Doug Paul, of MIT Lincoln Laboratory (MIT/LL), and Janet Baker, of Dragon Systems, are responsible for many of the details of these plans. This corpus is intended to supplant the RM corpora and to supplement the ATIS corpora as resources for the DARPA speech recognition research community.