Semi-supervised G2P bootstrapping and its application to ASR for a very under-resourced language: Iban

This paper describes our experiments and results on using a locally dominant language in Malaysia (Malay) to bootstrap automatic speech recognition (ASR) for a very under-resourced language: Iban (also spoken in Malaysia, on the island of Borneo). Resources for building an Iban speech recognition system were nonexistent. We therefore tried to take advantage of a language from the same family that shares several similarities with Iban. First, to address the pronunciation dictionary, we proposed a bootstrapping strategy to develop an Iban pronunciation lexicon from a Malay one. A hybrid version, a mix of Malay and Iban pronunciations, was also built and evaluated. Following this, we experimented with three Iban ASR systems, each relying on one of three pronunciation dictionaries: Malay, Iban, or hybrid.
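As a rough illustration of what a hybrid dictionary could look like, the sketch below merges two lexicons by taking the union of pronunciation variants for each word. This is an assumption about one plausible construction, not the procedure reported in the paper: the function name `merge_lexicons`, the dictionary layout, and the toy entries (placeholder words and phone symbols, not real Malay or Iban data) are all hypothetical.

```python
def merge_lexicons(*lexicons):
    """Union pronunciation variants per word, keeping first-seen order.

    Each lexicon maps a word to a list of pronunciations, where a
    pronunciation is a tuple of phone symbols. (Hypothetical layout.)
    """
    hybrid = {}
    for lexicon in lexicons:
        for word, pronunciations in lexicon.items():
            variants = hybrid.setdefault(word, [])
            for pron in pronunciations:
                if pron not in variants:  # skip exact duplicate variants
                    variants.append(pron)
    return hybrid


# Placeholder entries only; the symbols carry no linguistic meaning.
malay_based = {"w1": [("a", "b")], "w2": [("c",)]}
iban_based = {"w1": [("a", "b"), ("a", "d")], "w3": [("e",)]}

hybrid = merge_lexicons(malay_based, iban_based)
for word, variants in hybrid.items():
    for pron in variants:
        print(word, " ".join(pron))
```

Keeping every distinct variant lets the recognizer choose among pronunciations at decoding time, at the cost of a larger and possibly more confusable lexicon.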
