A new agenda for corpus linguistics - working with all of the world's languages

In this paper we argue that corpus linguistics needs to expand to cover a wider set of languages. While the reasons why some languages have not yet been provided with corpus data are clear, the intellectual and moral imperative to extend the range of corpus linguistics is strong. Such an extension, however, faces technical problems. We review these problems and explore possible solutions to them. We then consider what benefits the provision of appropriate corpus data may bring to languages currently untouched by the development of corpus linguistics.