Bilingual recursive neural network based data selection for statistical machine translation

Abstract: Data selection is a widely used and effective approach to domain adaptation in statistical machine translation (SMT). The dominant methods are perplexity-based, but they ignore whether the two sides of a sentence pair are mutual translations and tend to favor short sentences. To address these problems, we propose bilingual semi-supervised recursive neural network data selection methods that distinguish domain-relevant data from out-of-domain data. We evaluate the proposed methods on the task of building domain-adapted SMT systems, present extensive comparisons, and show that they outperform state-of-the-art data selection approaches.
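The perplexity-based baselines the abstract contrasts against are typically variants of the cross-entropy-difference criterion (Moore–Lewis selection) and its bilingual extension. The sketch below illustrates that scoring rule only; it uses toy unigram language models with add-one smoothing, and the example data and function names are illustrative assumptions rather than the paper's implementation.

```python
# Minimal sketch of perplexity-based (cross-entropy difference) data selection,
# the baseline family discussed in the abstract. Real systems use n-gram LMs;
# unigram models with add-one smoothing stand in here for brevity.
import math
from collections import Counter


def train_unigram(sentences):
    """Return (counts, total_tokens, vocab_size) for a tiny add-one unigram LM."""
    counts = Counter(tok for s in sentences for tok in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen tokens
    return counts, total, vocab


def cross_entropy(sentence, model):
    """Per-word cross-entropy of a sentence under the unigram LM."""
    counts, total, vocab = model
    toks = sentence.split()
    if not toks:
        return float("inf")
    logprob = sum(math.log((counts[t] + 1) / (total + vocab)) for t in toks)
    return -logprob / len(toks)


def moore_lewis_score(sentence, in_domain_lm, general_lm):
    """Cross-entropy difference: lower means more in-domain-like."""
    return cross_entropy(sentence, in_domain_lm) - cross_entropy(sentence, general_lm)


def bilingual_score(src, tgt, src_in, src_gen, tgt_in, tgt_gen):
    """Bilingual variant: sum the difference on both sides of the pair.
    Note that this still ignores whether source and target are mutual
    translations, which is the gap the abstract points to."""
    return (moore_lewis_score(src, src_in, src_gen)
            + moore_lewis_score(tgt, tgt_in, tgt_gen))


if __name__ == "__main__":
    # Hypothetical toy corpora: rank a candidate sentence against an
    # in-domain LM and a general-domain LM.
    in_lm = train_unigram(["the patient received a dose of aspirin"])
    gen_lm = train_unigram(["the cat sat on the mat",
                            "stocks fell sharply today"])
    print(moore_lewis_score("the patient was given aspirin", in_lm, gen_lm))
```

In practice, candidate sentence pairs are ranked by this score and the lowest-scoring (most in-domain-like) fraction is kept for training; the abstract's observation that such scores favor short sentences follows from the per-word normalization interacting with smoothed probabilities.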
