Groupwise learning for ASR k-best list reranking in spoken language translation

Quality estimation models are used to predict the quality of the output of a spoken language translation (SLT) system. When the predicted scores are used to rerank a k-best list, the relative order of the scores matters more than their absolute values. This paper proposes groupwise learning to model this order. Groupwise features were constructed by grouping pairs, triplets or M-plets among the ASR k-best outputs of the same sentence. Regression and classification models were trained on these features, and a score combination strategy was used to predict the rank of each hypothesis within the k-best list. Regression models with pairwise features gave larger gains than the other model and feature constructions. Groupwise learning was robust across sentences with different levels of ASR confidence, and the technique was also complementary to linear discriminant analysis feature projection. An overall BLEU score improvement of 0.80 was achieved on an in-domain English-to-French SLT task.
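The pairwise case can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the features of grouped hypotheses are joined or how the scores are combined, so the concatenation in `pairwise_features`, the summation in `combine_scores`, and the stand-in `toy_pair_model` are all assumptions made for illustration.

```python
from itertools import permutations


def pairwise_features(hyp_feats):
    """Build one feature vector per ordered pair (i, j) of k-best hypotheses.

    Assumption: the pair vector is the concatenation of the two hypothesis
    feature vectors; the abstract does not specify the construction.
    """
    pairs = {}
    for i, j in permutations(range(len(hyp_feats)), 2):
        pairs[(i, j)] = list(hyp_feats[i]) + list(hyp_feats[j])
    return pairs


def combine_scores(pair_scores, k):
    """Turn pair-level predictions into one score per hypothesis.

    Assumption: a pair score s for (i, j) is read as "i is better than j
    by s", so it is added to i's total and subtracted from j's total.
    """
    totals = [0.0] * k
    for (i, j), s in pair_scores.items():
        totals[i] += s
        totals[j] -= s
    return totals


def rerank(hyp_feats, pair_model):
    """Return hypothesis indices ordered from best to worst."""
    pairs = pairwise_features(hyp_feats)
    pair_scores = {ij: pair_model(x) for ij, x in pairs.items()}
    totals = combine_scores(pair_scores, len(hyp_feats))
    return sorted(range(len(hyp_feats)), key=lambda i: totals[i], reverse=True)


def toy_pair_model(x):
    """Hypothetical stand-in for a trained regressor on pair features."""
    half = len(x) // 2
    return sum(x[:half]) - sum(x[half:])


# Three hypotheses from one k-best list, two made-up features each.
k_best = [[0.2, 0.1], [0.9, 0.4], [0.5, 0.5]]
order = rerank(k_best, toy_pair_model)
print(order)
```

In practice `toy_pair_model` would be replaced by a regression model trained on pair-level targets, and only the ordering it induces within each k-best list would be used for reranking.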
