Low-resource speech translation of Urdu to English using semi-supervised part-of-speech tagging and transliteration

This paper describes the construction of ASR and MT systems for translating Urdu speech into English. Because both Urdu pronunciation lexicons and Urdu-English bitexts are scarce, we employ several techniques that use semi-supervised annotation to improve ASR and MT training. Specifically, we describe (1) the construction of a semi-supervised HMM-based part-of-speech tagger that is used to train factored translation models, and (2) the use of an HMM-based transliterator from which we derive a spelling-to-pronunciation model for Urdu that is used in ASR training. We describe experiments on both ASR and MT training in the context of the Urdu-to-English task of the NIST MT08 Evaluation, and we compare methods that use this additional annotation against standard statistical MT and ASR baselines.
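
To make the tagging step concrete, the following is a minimal sketch of Viterbi decoding for a bigram HMM part-of-speech tagger. It is an illustration only, not the system described in the paper: the tag set, the tokens `a` and `b`, and all probabilities in `START`, `TRANS`, and `EMIT` are hypothetical toy values, and the function name `viterbi` and the `floor` smoothing constant are assumptions. In a semi-supervised setting these parameters would presumably be estimated from a small tagged sample plus untagged text, which the sketch does not show.

```python
import math

# Hypothetical toy model: two tags, two token types, made-up probabilities.
TAGS = ["NOUN", "VERB"]
START = {"NOUN": 0.7, "VERB": 0.3}
TRANS = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
         "VERB": {"NOUN": 0.6, "VERB": 0.4}}
EMIT = {"NOUN": {"a": 0.8, "b": 0.2},
        "VERB": {"a": 0.1, "b": 0.9}}


def viterbi(tokens, tags, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable tag sequence for tokens under a bigram HMM."""
    if not tokens:
        return []

    def lp(p):
        # Work in log space; unseen events get a small floor probability.
        return math.log(p if p > 0 else floor)

    # Each column maps tag -> (best log-probability ending in that tag, previous tag).
    columns = [{t: (lp(start_p.get(t, 0)) + lp(emit_p[t].get(tokens[0], 0)), None)
                for t in tags}]
    for tok in tokens[1:]:
        prev_col = columns[-1]
        col = {}
        for t in tags:
            # Pick the predecessor tag that maximizes the path score into t.
            best_prev = max(tags, key=lambda p: prev_col[p][0] + lp(trans_p[p].get(t, 0)))
            score = prev_col[best_prev][0] + lp(trans_p[best_prev].get(t, 0))
            col[t] = (score + lp(emit_p[t].get(tok, 0)), best_prev)
        columns.append(col)

    # Backtrace from the best final tag to recover the full sequence.
    path = [max(tags, key=lambda t: columns[-1][t][0])]
    for col in reversed(columns[1:]):
        path.append(col[path[-1]][1])
    return list(reversed(path))


print(viterbi(["a", "b"], TAGS, START, TRANS, EMIT))
```

The predicted tags could then be attached to each source word as an extra factor when preparing training data for a factored translation model.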
