Word segmentation through cross-lingual word-to-phoneme alignment

We present Model 3P, a new alignment model for cross-lingual word-to-phoneme alignment, and show that unsupervised learning of word segmentation is more accurate when information from another language is used. Word segmentation with cross-lingual information is highly relevant for bootstrapping pronunciation dictionaries from audio data for Automatic Speech Recognition, for bypassing the written form in Speech-to-Speech Translation, and for building the vocabulary of an unseen language, particularly in the context of under-resourced languages. On the BTEC corpus [2], aligning English words to Spanish phonemes with Model 3P outperforms a state-of-the-art monolingual word segmentation approach [1] by up to 42% absolute in phoneme-level F-score, and a GIZA++ alignment based on IBM Model 3 by up to 17%.
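
To make the link between alignment and segmentation concrete, the following is a minimal sketch, not the Model 3P implementation. It assumes that each target phoneme is aligned to exactly one source word index, that a word boundary is placed wherever that index changes, and that segmentation quality is scored with an F-score over boundary positions. The abstract does not spell out the exact F-score definition, so the boundary-based variant here is an assumption, as are all function names and the toy data.

```python
from typing import List, Set


def segment_from_alignment(phonemes: List[str], alignment: List[int]) -> List[List[str]]:
    """Group consecutive phonemes that are aligned to the same source word index.

    Assumption: alignment[i] is the index of the source word that phoneme i
    is aligned to; a change of index marks a word boundary.
    """
    segments: List[List[str]] = []
    previous = None
    for phoneme, word_index in zip(phonemes, alignment):
        if not segments or word_index != previous:
            segments.append([phoneme])      # start a new word
        else:
            segments[-1].append(phoneme)    # continue the current word
        previous = word_index
    return segments


def boundaries(segments: List[List[str]]) -> Set[int]:
    """Return the phoneme offsets at which an internal word boundary occurs."""
    positions: Set[int] = set()
    offset = 0
    for segment in segments[:-1]:           # the final end is not a boundary
        offset += len(segment)
        positions.add(offset)
    return positions


def boundary_f_score(hypothesis: List[List[str]], reference: List[List[str]]) -> float:
    """Harmonic mean of boundary precision and recall (0.0 if undefined)."""
    hyp, ref = boundaries(hypothesis), boundaries(reference)
    correct = len(hyp & ref)
    if correct == 0:
        return 0.0
    precision = correct / len(hyp)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up data): two English words aligned to ten Spanish phonemes.
phonemes = ["b", "w", "e", "n", "o", "s", "d", "i", "a", "s"]
alignment = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
hypothesis = segment_from_alignment(phonemes, alignment)
reference = [["b", "w", "e", "n", "o", "s"], ["d", "i", "a", "s"]]
score = boundary_f_score(hypothesis, reference)
```

In this toy case the induced segmentation matches the reference, so every hypothesized boundary is correct. A real alignment may be non-monotonic or leave phonemes unaligned, which this sketch does not handle.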

[1] Mark Johnson, et al. Improving nonparameteric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars, 2009, NAACL.

[2] Bowen Zhou, et al. Towards Speech Translation of Non Written Languages, 2006, 2006 IEEE Spoken Language Technology Workshop.

[3] D. E. Goldberg, et al. Genetic Algorithms in Search, 1989.

[4] Tanja Schultz, et al. Multilingual Speech Processing, 2006.

[5] Suzanne Romaine, et al. Vanishing Voices: The Extinction of the World's Languages, 2000.

[6] Hermann Ney, et al. A Systematic Comparison of Various Statistical Alignment Models, 2003, CL.

[7] David E. Goldberg, et al. Genetic Algorithms in Search Optimization and Machine Learning, 1988.

[8] Andreas Zell, et al. The EvA2 Optimization Framework, 2010, LION.

[9] Hermann Ney, et al. HMM-Based Word Alignment in Statistical Translation, 1996, COLING.

[10] Hermann Ney, et al. Improved Statistical Alignment Models, 2000, ACL.

[11] L. Joan. Vanishing Voices: The Extinction of the World's Languages, 2004.

[12] Robert L. Mercer, et al. The Mathematics of Statistical Machine Translation: Parameter Estimation, 1993, CL.

[13] Chunyu Kit, et al. Unsupervised lexical learning as inductive inference, 2000.

[14] Alexander Dekhtyar, et al. Information Retrieval, 2018, Lecture Notes in Computer Science.

[15] Eiichiro Sumita, et al. Creating corpora for speech-to-speech translation, 2003, INTERSPEECH.

[16] Sebastian Stüker, et al. Towards human translations guided language discovery for ASR systems, 2008, SLTU.

[17] Goldberg, et al. Genetic algorithms, 1993, Robust Control Systems with Genetic Algorithms.

[18] Mark Johnson, et al. Using Adaptor Grammars to Identify Synergies in the Unsupervised Acquisition of Linguistic Structure, 2008, ACL.

[19] John Goldsmith, et al. An algorithm for the unsupervised learning of morphology, 2006, Natural Language Engineering.

[20] Kevin Knight. A Statistical MT Tutorial Workbook, 2003.

[21] Barbara F. Grimes. Ethnologue Languages of the World, 1988.

[22] Tanja Schultz, et al. GlobalPhone: a multilingual speech and text database developed at Karlsruhe University, 2002, INTERSPEECH.