Using confidence measures to improve decoding in speech translation

Word-level confidence measures (Word Confidence Estimation, WCE) for machine translation (MT) or automatic speech recognition (ASR) assign a confidence score to each word in a transcription or translation hypothesis. In the past, these measures have most often been estimated separately, in either an ASR or an MT context. Here we propose a joint estimation of the confidence associated with each word in a spoken language translation (SLT) hypothesis. This estimation draws on features coming from both the speech transcription (ASR) and machine translation (MT) systems. Beyond building these robust confidence estimators for SLT, we use the confidence information to re-decode our translation hypothesis graphs. Our experiments show that using these confidence measures in a second decoding pass yields a significant improvement in translation quality (measured with the BLEU metric: a gain of two points over our baseline speech translation system). These experiments are carried out on a French-English SLT task for which a corpus was specifically designed (this corpus, made available to the NLP community, is also described in detail in the article).

Abstract. Word Confidence Estimation (WCE) for machine translation (MT) or automatic speech recognition (ASR) assigns a confidence score to each word in the MT or ASR hypothesis. In the past, this task has been treated separately in ASR or MT contexts, and we propose here a joint estimation of word confidence for a spoken language translation (SLT) task involving both ASR and MT. We build robust word confidence estimators for SLT, based on joint ASR and MT features. Using these word confidence measures to re-decode the spoken language translation graph leads to a significant BLEU improvement (2 points) compared to the SLT baseline. These experiments are done for a French-English SLT task for which a corpus was specifically designed (this corpus being made available to the NLP community).

Keywords: confidence measures, spoken language translation, joint features, graph re-decoding.

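The following Python sketch is purely illustrative and is not the system described in the paper: it shows, with invented feature names, weights, and hypotheses, how joint ASR-side and MT-side word-level features might be combined into per-word confidence scores and then used in a simple second-pass re-ranking of translation hypotheses, a simplified stand-in for the graph re-decoding described above.

```python
import math

# Hypothetical per-word features: an ASR-side word posterior, an MT-side word
# posterior, and a language-model back-off flag. The weights of this toy
# logistic model are invented for illustration only.
WEIGHTS = {"asr_posterior": 2.0, "mt_posterior": 1.5, "backoff": -0.8, "bias": -1.2}

def word_confidence(features):
    """Logistic combination of joint ASR/MT features into a [0, 1] confidence."""
    z = WEIGHTS["bias"]
    z += WEIGHTS["asr_posterior"] * features["asr_posterior"]
    z += WEIGHTS["mt_posterior"] * features["mt_posterior"]
    z += WEIGHTS["backoff"] * features["backoff"]
    return 1.0 / (1.0 + math.exp(-z))

def rerank(nbest, wce_weight=0.5):
    """Re-rank hypotheses by baseline decoder score plus mean log word confidence."""
    def score(hyp):
        confs = [word_confidence(f) for f in hyp["word_features"]]
        mean_log_conf = sum(math.log(c) for c in confs) / len(confs)
        return hyp["decoder_score"] + wce_weight * mean_log_conf
    return sorted(nbest, key=score, reverse=True)

# Toy 2-best list for one source utterance (all values invented).
nbest = [
    {"text": "the meeting starts at noon",
     "decoder_score": -4.1,
     "word_features": [{"asr_posterior": 0.9, "mt_posterior": 0.8, "backoff": 0}] * 5},
    {"text": "the meeting start at noon",
     "decoder_score": -4.0,
     "word_features": [{"asr_posterior": 0.9, "mt_posterior": 0.4, "backoff": 1}] * 5},
]
print(rerank(nbest)[0]["text"])
```

In this toy example the second hypothesis has the better baseline decoder score, but its lower word confidences let the first hypothesis win after the confidence-weighted re-ranking, which is the intuition behind using WCE in a second decoding pass.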