Spoltech and OGI-22 Baseline Systems for Speech Recognition in Brazilian Portuguese

Speech processing is a data-driven technology that relies on public corpora and associated resources. In contrast to languages such as English, there are few resources for Brazilian Portuguese (BP). This work describes efforts toward narrowing this gap and presents systems for speech recognition in BP using two public corpora: Spoltech and OGI-22. The following resources are made available: HTK scripts, a pronunciation dictionary, and language and acoustic models. The work discusses the baseline results obtained with these resources.