Semi-supervised Annotator of Speech Corpora and AGH Speech Corpus of Polish

Software for generating professional speech corpora from audiobooks and the corresponding book texts is presented. The software allows speech corpora to be created much faster and more cheaply than with traditional methods. Existing speech resources for Polish are described, along with a brief introduction to Polish dialects. A small example corpus of Polish built with the described tool is also presented.