Overview of the IWSLT 2010 evaluation campaign

This paper gives an overview of the evaluation campaign results of the 7th International Workshop on Spoken Language Translation (IWSLT 2010). This year, we focused on three spoken language tasks: (1) public speeches on a variety of topics (TALK) from English to French, (2) spoken dialog in travel situations (DIALOG) between Chinese and English, and (3) traveling expressions (BTEC) from Arabic, Turkish, and French to English. In total, 28 teams (including 7 first-time participants) took part in the shared tasks, submitting 60 primary and 112 contrastive runs. Automatic and subjective evaluations of the primary runs were carried out in order to investigate the impact of different communication modalities, spoken language styles, and semantic context on automatic speech recognition (ASR) and machine translation (MT) system performance.
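Automatic evaluation in campaigns of this kind typically relies on n-gram overlap metrics such as BLEU. As a rough illustration only, the sketch below shows how a corpus-level BLEU score could be computed for a single submitted run using NLTK; the file names, whitespace tokenization, and use of a single reference translation are assumptions made for the example and do not reproduce the campaign's official scoring pipeline.

```python
# Minimal sketch: scoring a hypothetical primary run against one reference
# translation with corpus-level BLEU. File names and whitespace tokenization
# are illustrative assumptions, not the official evaluation setup.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def read_tokenized(path):
    """Read one segment per line and split on whitespace."""
    with open(path, encoding="utf-8") as f:
        return [line.strip().split() for line in f]

hypotheses = read_tokenized("primary_run.txt")   # system output, one line per segment
references = read_tokenized("reference.txt")     # human reference, aligned line by line

# corpus_bleu expects a list of reference sets (one set per segment).
score = corpus_bleu(
    [[ref] for ref in references],
    hypotheses,
    smoothing_function=SmoothingFunction().method1,  # avoid zero scores on short segments
)
print(f"BLEU = {score:.4f}")
```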
