Automatic Pronunciation Dictionary Generation from Wiktionary and Wikipedia

In this work we show that dictionaries from the World Wide Web which contain phonetic notations can be a good basis for rapidly creating pronunciation dictionaries when building speech recognition and speech synthesis systems. As a representative dictionary we selected wiktionary.org [1], since it is available in multiple languages and, in addition to word definitions, contains many phonetic notations written in the International Phonetic Alphabet (IPA). We measured how many words of given vocabulary lists have pronunciations in five languages. Furthermore, we checked quality by comparing the pronunciations from the Web dictionary to those of dictionaries from the GlobalPhone project [2], which are commonly used by the speech community. The languages examined are English, French, German, Spanish and Vietnamese. The French wiktionary.org achieved the best results: it contained pronunciations for 92.580% of the GlobalPhone vocabulary, and for 33.333% and 76.119% of lists of international cities and countries, respectively. Finally, we plan to integrate our work into the Rapid Language Adaptation Toolkit (RLAT), a web-based toolkit that enables naive users to create speech recognizers in any language [3].
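The quantity check described above amounts to a coverage percentage: the share of words in a vocabulary list for which the Web dictionary offers at least one pronunciation. The following is a minimal sketch of that arithmetic, not the authors' implementation. The function name `coverage`, the exact-match lookup, and the toy French entries are all assumptions made for illustration.

```python
def coverage(vocabulary, pronunciations):
    """Return the percentage of distinct vocabulary words that have
    at least one pronunciation in the given dictionary.

    vocabulary     -- iterable of words (hypothetical input format)
    pronunciations -- dict mapping a word to a list of IPA strings
    """
    words = set(vocabulary)  # count each distinct word once
    if not words:
        return 0.0
    # A word counts as covered only if its entry exists and is non-empty.
    found = sum(1 for w in words if pronunciations.get(w))
    return 100.0 * found / len(words)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    vocab = ["bonjour", "merci", "paris"]
    web_dict = {"bonjour": ["bɔ̃ʒuʁ"], "merci": ["mɛʁsi"]}
    print(f"{coverage(vocab, web_dict):.3f}%")
```

Exact string matching is the simplest choice here; a real system would likely need normalization (case, diacritics, inflected forms), which this sketch deliberately leaves out.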

[1] Sanjeev Khudanpur, et al. Web-derived pronunciations, 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[2] Dirk Van Compernolle. Recognizing speech of goats, wolves, sheep and ... non-natives, 2001, Speech Commun.

[3] Ariadna Font Llitjós, et al. Evaluation and collection of proper name pronunciations online, 2002, LREC.

[4] William C. Hannas, et al. Asia's Orthographic Dilemma, 1996.

[5] Alan W. Black, et al. Issues in building general letter to sound rules, 1998, SSW.

[6] E. Vajda. Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet, 2000.

[7] Tanja Schultz, et al. Polyphone decision tree specialization for language adaptation, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings.

[8] Ronald Rosenfeld, et al. Improving trigram language modeling with the World Wide Web, 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings.

[9] Tanja Schultz, et al. SPICE: web-based tools for rapid language adaptation in speech processing systems, 2007, INTERSPEECH.

[10] Weblog Wikipedia, et al. In Wikipedia the Free Encyclopedia, 2005.

[11] Tanja Schultz. Multilinguale Spracherkennung: Kombination akustischer Modelle zur Portierung auf neue Sprachen, 2001.

[12] Tanja Schultz, et al. GlobalPhone: a multilingual speech and text database developed at Karlsruhe University, 2002, INTERSPEECH.