Word confidence estimation for speech translation

Word Confidence Estimation (WCE) for machine translation (MT) or automatic speech recognition (ASR) consists in judging each word in the (MT or ASR) hypothesis as correct or incorrect by tagging it with an appropriate label. In the past, this task has been treated separately in ASR and MT contexts; we propose here a joint estimation of word confidence for a spoken language translation (SLT) task involving both ASR and MT. This work is made possible by a dedicated corpus that we built and present first. The corpus contains 2643 speech utterances, for each of which a quintuplet is made available: ASR output (src-asr), verbatim transcript (src-ref), text translation output (tgt-mt), speech translation output (tgt-slt), and post-edition of the translation (tgt-pe). The rest of the paper illustrates how such a corpus (made available to the research community) can be used for evaluating word confidence estimators in ASR, MT or SLT scenarios. WCE for SLT could help rescore SLT output graphs, improve translators' productivity (for the translation of lectures or movie subtitling), or be useful in interactive speech-to-speech translation scenarios.

Keywords: Word Confidence Estimation (WCE), Spoken Language Translation (SLT), corpus, joint features.
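
The tagging task and the quintuplet described above can be illustrated with a minimal Python sketch. It is not the authors' labeling procedure: the `Quintuplet` class, the `tag_words` function, the use of `difflib` for alignment, and the toy sentences are all assumptions made for illustration. Only the five field names follow the abstract.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List


@dataclass
class Quintuplet:
    """One corpus entry; field names mirror the abstract (hypothetical layout)."""
    src_asr: str  # ASR output
    src_ref: str  # verbatim transcript
    tgt_mt: str   # translation of the verbatim transcript
    tgt_slt: str  # translation of the ASR output
    tgt_pe: str   # post-edition of the translation


def tag_words(hypothesis: str, reference: str) -> List[str]:
    """Tag each hypothesis word as 'correct' or 'incorrect'.

    Assumption: a word counts as correct when it falls inside a block that
    difflib aligns with the reference. This is a stand-in for whatever
    alignment tool is actually used to derive reference labels.
    """
    hyp = hypothesis.split()
    ref = reference.split()
    labels = ["incorrect"] * len(hyp)
    matcher = SequenceMatcher(a=hyp, b=ref, autojunk=False)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            labels[i] = "correct"
    return labels


# Toy entry, invented for illustration only (not taken from the corpus).
entry = Quintuplet(
    src_asr="the cat sat on a mat",
    src_ref="the cat sat on the mat",
    tgt_mt="le chat est assis sur le tapis",
    tgt_slt="le chat est assis sur un tapis",
    tgt_pe="le chat est assis sur le tapis",
)

# Labels for the ASR hypothesis against the verbatim transcript.
asr_labels = tag_words(entry.src_asr, entry.src_ref)
# Labels for the SLT hypothesis against the post-edition (an assumed pairing).
slt_labels = tag_words(entry.tgt_slt, entry.tgt_pe)
```

A word confidence estimator would then be trained to predict such labels from features of the hypothesis, and evaluated against them. Which reference each hypothesis is compared with is a design choice that this sketch fixes arbitrarily.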
