Statistical speech translation system based on voice recognition optimization using multimodal sources of knowledge and characteristics vectors

Abstract: The synergistic combination of different sources of knowledge is a key issue in the development of modern statistical translators. This work presents a statistical speech translation system that incorporates information other than voice into the translation process. This additional information serves as the basis for a log-linear combination of several statistical models. We describe the theoretical framework of the problem, summarize the overall architecture of the system, and show how the additional information enhances it. Our prototype implements a real-time speech translation system from Spanish to English, adapted to specific teaching-related environments.
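
As a rough illustration of the log-linear combination mentioned in the abstract, the sketch below scores each candidate translation as a weighted sum of log-probabilities, one per model, and keeps the highest-scoring candidate. The model names (`translation`, `language`, `context`), the weights, and the probabilities are hypothetical values chosen for the example; they are not taken from the described system, which may use different models and tune its weights differently.

```python
import math

def log_linear_score(features, weights):
    """Return sum_m lambda_m * log h_m for one candidate.

    `features` maps a model name to the probability that model assigns
    to the candidate; `weights` maps the same names to lambda_m.
    """
    return sum(weights[name] * math.log(p) for name, p in features.items())

def best_hypothesis(candidates, weights):
    """Pick the candidate with the highest log-linear score."""
    return max(candidates, key=lambda c: log_linear_score(c["features"], weights))

# Hypothetical weights: "context" stands in for the non-voice information.
weights = {"translation": 1.0, "language": 0.6, "context": 0.4}

# Hypothetical candidates with made-up model probabilities.
candidates = [
    {"text": "open the book",
     "features": {"translation": 0.30, "language": 0.20, "context": 0.50}},
    {"text": "open the pound",
     "features": {"translation": 0.35, "language": 0.05, "context": 0.02}},
]

best = best_hypothesis(candidates, weights)
print(best["text"])
```

In this toy setting the second candidate has the higher translation-model probability, but the language and context models outweigh it, so the first candidate is selected. In practice the weights would be tuned on held-out data rather than fixed by hand.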
