A preliminary study on corpus design for computer-assisted German and Mandarin language learning

This paper reports on the progress of a joint German-Taiwan computer-assisted language learning (CALL) project. One major goal of this project is to collect a bilingual speech corpus, covering both the native language (L1) and the second language (L2), from L2 learners of German and Mandarin in Germany and Taiwan. In the preparation phase of the database collection, a contrastive analysis of the German and Mandarin phonetic and prosodic systems is performed, and the pronunciation errors that L2 learners are likely to make are hypothesized in a set of confusion tables. We plan to use these confusion tables in the design of the database. The database collection itself will be conducted over the next three years.