Speech technology-based framework for quantitative analysis of German spelling errors in freely composed children's texts

Reading and writing are core competencies of any society. In Germany, international and national comparative studies such as PISA (Programme for International Student Assessment) or PIRLS (Progress in International Reading Literacy Study; IGLU in German) have shown that around 25% of German school children do not reach, by the age of 15, the minimal competence level necessary to function effectively in society. To teach writing to school children more effectively, a detailed analysis of their spelling errors can help in deriving individually tuned exercises. The work presented here forms the basis for frequently repeatable diagnosis and automatic error profiling on freely written text. We perform an automatic analysis of transcribed children’s texts for which the orthographically correct target is already known. The algorithm identifies 25 different types of errors defined by educators, without manual intervention. The errors found were checked by the authors, who agree with the completeness and correctness of the classification. Automatic analysis of spelling errors has not previously been achieved for the German language, and the work presented here opens new perspectives on large-scale analysis of the development of written language in children.