Spoken language translation graphs re-decoding using automatic quality assessment

This paper investigates how automatic quality assessment of spoken language translation (SLT), also called confidence estimation (CE), can help re-decode SLT output graphs and improve overall speech translation performance. Our graph re-decoding method can be seen as a second pass of translation. It requires a robust word confidence estimator for SLT. We propose several estimators based on our estimation of transcription (ASR) quality, translation (MT) quality, or both (combined ASR+MT). Using these word confidence measures to re-decode the SLT graph yields a significant BLEU improvement (more than 2 points) over our SLT baseline on a French-English SLT task. These results could be applied to interactive speech translation or computer-assisted translation of speeches and lectures.
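The second-pass idea can be illustrated with a toy sketch. The abstract does not state how confidence scores enter the graph, so the additive edge bonus used here is an assumption, and the graph, the confidence values, and the names `GRAPH`, `CONFIDENCE`, and `best_path` are all hypothetical. This is not the authors' implementation.

```python
# Toy search graph: node -> list of (next_node, target_word, decoder_score).
# Scores are log-domain and invented purely for illustration.
GRAPH = {
    0: [(1, "the", -0.2), (1, "a", -0.1)],
    1: [(2, "house", -0.3), (2, "horse", -0.25)],
    2: [],
}

# Hypothetical word-level confidence in [0, 1], standing in for the output
# of a word confidence estimator (ASR, MT, or combined ASR+MT).
CONFIDENCE = {"the": 0.9, "a": 0.4, "house": 0.8, "horse": 0.3}


def best_path(graph, start, end, confidence=None, weight=0.0):
    """Best-scoring path in a DAG whose node ids are in topological order.

    Assumption: each edge score is shifted by weight * (confidence - 0.5),
    rewarding words judged reliable and penalizing the others.
    """
    best = {start: (0.0, [])}
    for node in sorted(graph):
        if node not in best:
            continue  # node not reachable from start
        score, words = best[node]
        for nxt, word, edge_score in graph[node]:
            bonus = weight * (confidence[word] - 0.5) if confidence else 0.0
            cand = score + edge_score + bonus
            if nxt not in best or cand > best[nxt][0]:
                best[nxt] = (cand, words + [word])
    return best[end]


# First pass: decoder scores only. Second pass: confidence-adjusted scores.
first_pass = best_path(GRAPH, 0, 2)
second_pass = best_path(GRAPH, 0, 2, CONFIDENCE, weight=1.0)
print(first_pass[1], second_pass[1])
```

In this toy example the first pass selects "a horse", while the confidence-adjusted second pass selects "the house". The `weight` parameter is a free choice in the sketch and would need tuning on held-out data in any real system.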
