English-to-Chinese Transliteration with Phonetic Auxiliary Task

Approaching named-entity transliteration as a Neural Machine Translation (NMT) problem is common practice. While many have applied various NMT techniques to enhance machine transliteration models, few focus on the linguistic features particular to the relevant languages. In this paper, we investigate the effect of incorporating phonetic features for English-to-Chinese transliteration under the multi-task learning (MTL) setting, where we define a phonetic auxiliary task aimed at improving the generalization performance of the main transliteration task. In addition to our system, we also release a new English-to-Chinese dataset and propose a novel evaluation metric that considers multiple possible transliterations of a given source name. Our results show that the multi-task model achieves performance similar to the previous state of the art with a much smaller model.
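The abstract does not define the proposed metric, so the following is only a minimal sketch of how an evaluation that accepts multiple valid transliterations per source name could look. It is not the authors' metric. The function names (`edit_distance`, `multi_ref_accuracy`, `min_edit_distance`) and the toy name pairs are assumptions introduced for illustration and are not taken from the released dataset.

```python
from typing import Dict, Iterable, Set


def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def multi_ref_accuracy(predictions: Dict[str, str],
                       references: Dict[str, Set[str]]) -> float:
    """Fraction of source names whose prediction matches ANY accepted reference."""
    if not predictions:
        return 0.0
    hits = sum(1 for src, hyp in predictions.items()
               if hyp in references.get(src, set()))
    return hits / len(predictions)


def min_edit_distance(hypothesis: str, refs: Iterable[str]) -> int:
    """Distance from the hypothesis to its closest accepted reference."""
    return min(edit_distance(hypothesis, r) for r in refs)


# Hypothetical toy data: each source name maps to a set of accepted outputs.
references = {"Amy": {"艾米", "埃米"}, "Tom": {"汤姆"}}
predictions = {"Amy": "埃米", "Tom": "托姆"}

accuracy = multi_ref_accuracy(predictions, references)
closest = min_edit_distance(predictions["Tom"], references["Tom"])
```

Scoring against a set of references rather than a single one avoids penalizing a system for producing a transliteration that is acceptable but differs from the one variant that happens to be listed.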
