Corpus of Marathi Word Frequencies from Touch-Screen Devices Using Swarachakra Android Keyboard

We describe and publish online a corpus of word frequencies from Marathi text typed by 27,474 users of the Android version of the Swarachakra Marathi keyboard on their mobile devices between August 2013 and September 2014. The corpus contains 1,484,059 words in total, of which 184,257 are unique. We present a preliminary analysis of the word frequencies and compare them with two existing corpora. We also give a qualitative review of the errors users made while typing and of some idiosyncrasies they exhibited. We expect this corpus to be useful to future researchers, particularly those working on word completion and auto-correction of user errors.