Report of NEWS 2016 Machine Transliteration Shared Task

This report documents the Machine Transliteration Shared Task conducted as part of the Named Entities Workshop (NEWS 2012), an ACL 2012 workshop. The shared task covers machine transliteration of proper names from English into 11 languages and from 3 languages into English, for a total of 14 tasks. Seven teams participated in the evaluations and submitted 57 standard runs and 1 non-standard run, exploring diverse transliteration methodologies and reporting them on the evaluation data. We report the results using 4 performance metrics. We believe the shared task has achieved its objective of providing a common benchmarking platform on which the research community can evaluate state-of-the-art technologies, to the benefit of future research and development.
