The creation, distribution and use of linguistic data: the case of the Linguistic Data Consortium

The Linguistic Data Consortium (LDC) is an open consortium of universities, companies and government research laboratories. It creates and distributes speech and text databases, lexicons and other resources. The University of Pennsylvania is the LDC’s host institution. The LDC was founded in 1992 with a grant from the Defense Advanced Research Projects Agency (DARPA). Currently, all LDC publication and distribution activities are self-supporting, while new data creation is partly supported by grant IRI 9528587 from the Information, Robotics and Intelligent Systems division of the National Science Foundation (NSF). The LDC’s core mission remains the support of pre-competitive research and development in speech and language technology, but support of other language-related research is also an important focus.