Open and extendable speech recognition application architecture for mobile environments

This paper describes a cloud-based speech recognition architecture primarily intended for mobile environments. The system consists of a speech recognition server and a client for the Android mobile operating system. The system enables to implement Android’s Voice Input functionality for languages that are not supported by the default Google implementation. The architecture supports both large vocabulary speech recognition as well as grammar-based recognition, where grammars can be implemented in JSGF or Grammatical Framework. The system is open source and easily extendable. We used the architecture to implement Estonian speech recognition for Android.