Speech Recognition, Machine Translation, and Speech Translation—A Unified Discriminative Learning Paradigm [Lecture Notes]

In the past two decades, significant progress has been made in automatic speech recognition (ASR) [2], [9] and statistical machine translation (MT) [12]. Despite some conspicuous differences, many problems in ASR and MT are closely related and techniques in the two fields can be successfully cross-pollinated. In this lecture note, we elaborate on the fundamental connections between ASR and MT, and show that the unified ASR discriminative training paradigm recently developed and presented in [7] can be extended to train MT models in the same spirit.

[1]  Hermann Ney,et al.  A systematic comparison of various statistical alignment models , 2003 .

[2]  Richard M. Schwartz,et al.  Combining Outputs from Multiple Machine Translation Systems , 2007, NAACL.

[3]  James R. Glass,et al.  Developments and directions in speech recognition and understanding, Part 1 [DSP Education] , 2009, IEEE Signal Processing Magazine.

[4]  Wu Chou,et al.  Discriminative learning in sequential pattern recognition , 2008, IEEE Signal Processing Magazine.

[5]  Scott Axelrod,et al.  Discriminative Estimation of Subspace Constrained Gaussian Mixture Models for Speech Recognition , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[6]  J. Treichler.  Signal processing: A view of the future, Part 2 [Exploratory DSP] , 2009, IEEE Signal Processing Magazine.

[7]  William J. Byrne,et al.  Discriminative speaker adaptation with conditional maximum likelihood linear regression , 2001, INTERSPEECH.

[8]  John Hutchins,et al.  From First Conception to First Demonstration: the Nascent Years of Machine Translation, 1947–1954. A Chronology , 1998, Machine Translation.

[9]  Ben Taskar,et al.  An End-to-End Discriminative Approach to Machine Translation , 2006, ACL.

[10]  Daniel Marcu,et al.  Scalable Inference and Training of Context-Rich Syntactic Translation Models , 2006, ACL.

[11]  Xiaodong He,et al.  Discriminative Learning for Speech Recognition: Theory and Practice , 2008, Discriminative Learning for Speech Recognition.

[12]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[13]  Hermann Ney,et al.  A Systematic Comparison of Various Statistical Alignment Models , 2003, CL.

[14]  Hermann Ney,et al.  Integrating Speech Recognition and Machine Translation: Where do We Stand? , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[15]  Daniel Marcu,et al.  Statistical Phrase-Based Translation , 2003, NAACL.

[16]  Li Deng,et al.  An Overview of Modern Speech Recognition , 2010, Handbook of Natural Language Processing.

[17]  Dong Yu,et al.  Large-Margin Minimum Classification Error Training for Large-Scale Speech Recognition Tasks , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[18]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[19]  Taro Watanabe,et al.  A Unified Approach in Speech-to-Speech Translation: Integrating Features of Speech recognition and Machine Translation , 2004, COLING.

[20]  Robert L. Mercer,et al.  The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.

[21]  Li Deng,et al.  A novel decision function and the associated decision-feedback learning for speech translation , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[22]  James Glass,et al.  Research Developments and Directions in Speech Recognition and Understanding, Part 1 , 2009 .

[23]  Ralph Weischedel,et al.  A Study of Translation Error Rate with Targeted Human Annotation , 2005 .

[24]  Hermann Ney,et al.  Speech translation: coupling of recognition and translation , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[25]  Franz Josef Och,et al.  Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.

[26]  Matthew G. Snover,et al.  A Study of Translation Edit Rate with Targeted Human Annotation , 2006, AMTA.

[27]  Sebastian Stüker,et al.  Overview of the IWSLT 2010 evaluation campaign , 2010, IWSLT.

[28]  David Chiang,et al.  Hierarchical Phrase-Based Translation , 2007, CL.

[29]  Miles Osborne,et al.  Statistical Machine Translation , 2010, Encyclopedia of Machine Learning and Data Mining.