Bamana Reference Corpus (BRC)

Abstract. The Bambara Reference Corpus (Corpus Bambara de Référence) is one of the first corpora of more than a million words for the languages of Africa south of the Sahara, and probably the only one freely accessible on the Internet. The entire corpus is tone-marked, POS-tagged and glossed in French. This paper surveys the tools and resources developed for the Bambara Reference Corpus and describes the process of building the corpus.