[1] Holger Schwenk,et al. Beyond English-Centric Multilingual Machine Translation , 2020, J. Mach. Learn. Res.
[2] Cordelia Schmid,et al. Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Jörg Tiedemann,et al. Parallel Data, Tools and Interfaces in OPUS , 2012, LREC.
[4] Dragos Stefan Munteanu,et al. Improving Machine Translation Performance by Exploiting Non-Parallel Corpora , 2005, CL.
[5] Holger Schwenk,et al. WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia , 2019, EACL.
[6] Holger Schwenk,et al. On the Use of Comparable Corpora to Improve SMT performance , 2009, EACL.
[7] Naveen Arivazhagan,et al. Language-agnostic BERT Sentence Embedding , 2020, ArXiv.
[8] Thierry Etchegoyhen,et al. Weighted Set-Theoretic Alignment of Comparable Sentences , 2017, BUCC@ACL.
[9] Stefan Daniel Dumitrescu,et al. Wikipedia as an SMT Training Corpus , 2013, RANLP.
[10] Philipp Koehn,et al. Europarl: A Parallel Corpus for Statistical Machine Translation , 2005, MTSUMMIT.
[11] Mikel L. Forcada,et al. Combining Content-Based and URL-Based Heuristics to Harvest Aligned Bitexts from Multilingual Sites with Bitextor , 2010, Prague Bull. Math. Linguistics.
[12] Simon Gottschalk,et al. MultiWiki , 2017, ACM Trans. Web.
[13] Pierre Zweigenbaum,et al. Overview of the Second BUCC Shared Task: Spotting Parallel Sentences in Comparable Corpora , 2017, BUCC@ACL.
[14] Timothy Baldwin,et al. Cross-domain Feature Selection for Language Identification , 2011, IJCNLP.
[15] Myle Ott,et al. Scaling Neural Machine Translation , 2018, WMT.
[16] Philippe Langlais,et al. Identifying Parallel Documents from a Large Bilingual Collection of Texts: Application to Parallel Article Extraction in Wikipedia , 2011, BUCC@ACL.
[17] Lijun Wu,et al. Achieving Human Parity on Automatic Chinese to English News Translation , 2018, ArXiv.
[18] Jeff Johnson,et al. Billion-Scale Similarity Search with GPUs , 2017, IEEE Transactions on Big Data.
[19] Kristina Toutanova,et al. Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment , 2010, NAACL.
[20] Huda Khayrallah,et al. Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering , 2018, WMT.
[21] Mohammad Shoeybi,et al. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism , 2019, ArXiv.
[22] Philipp Koehn,et al. Low-Resource Corpus Filtering Using Multilingual Sentence Embeddings , 2019, WMT.
[23] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[24] Ray Kurzweil,et al. Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax , 2019, IJCAI.
[25] Keith Stevens,et al. Effective Parallel Corpus Mining using Bilingual Sentence Embeddings , 2018, WMT.
[26] Thierry Etchegoyhen,et al. Set-Theoretic Alignment for Comparable Corpora , 2016, ACL.
[27] Marcin Junczys-Dowmunt,et al. The United Nations Parallel Corpus v1.0 , 2016, LREC.
[28] Tomas Mikolov,et al. Bag of Tricks for Efficient Text Classification , 2016, EACL.
[29] Myle Ott,et al. Facebook FAIR’s WMT19 News Translation Task Submission , 2019, WMT.
[30] Holger Schwenk,et al. Filtering and Mining Parallel Data in a Joint Multilingual Space , 2018, ACL.
[31] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019.
[32] Josef van Genabith,et al. An Empirical Analysis of NMT-Derived Interlingual Embeddings and Their Use in Parallel Sentence Identification , 2017, IEEE Journal of Selected Topics in Signal Processing.
[33] Noah A. Smith,et al. The Web as a Parallel Corpus , 2003, CL.
[34] Graham Neubig,et al. Overview of the 5th Workshop on Asian Translation , 2019, PACLIC.
[35] Taku Kudo,et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing , 2018, EMNLP.
[36] Mehdi Mohammadi,et al. Building Bilingual Parallel Corpora Based on Wikipedia , 2010, 2010 Second International Conference on Computer Engineering and Applications.
[37] Ahmad Aghaebrahimian. Deep Neural Networks at the Service of Multilingual Parallel Sentence Extraction , 2018, COLING.
[38] Holger Schwenk,et al. Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings , 2018, ACL.
[39] Edouard Grave,et al. Reducing Transformer Depth on Demand with Structured Dropout , 2019, ICLR.
[40] Philipp Koehn,et al. Findings of the WMT 2016 Bilingual Document Alignment Shared Task , 2016, WMT.
[41] Eneko Agirre,et al. Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining , 2020, ACL.
[42] Marta R. Costa-jussà,et al. Findings of the 2019 Conference on Machine Translation (WMT19) , 2019, WMT.
[43] Philipp Koehn,et al. Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions , 2019, WMT.
[44] Dan Roth,et al. Cross-lingual Wikification Using Multilingual Embeddings , 2016, NAACL.
[45] Vishrav Chaudhary,et al. CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data , 2019, LREC.
[46] Isao Goto,et al. Overview of the 7th Workshop on Asian Translation , 2020, WAT.
[47] Holger Schwenk,et al. Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond , 2018, Transactions of the Association for Computational Linguistics.
[48] Philipp Koehn,et al. A Massive Collection of Cross-Lingual Web-Document Pairs , 2019, EMNLP.
[49] Philippe Langlais,et al. BUCC 2017 Shared Task: a First Attempt Toward a Deep Learning Framework for Identifying Parallel Sentences in Comparable Corpora , 2017, BUCC@ACL.
[50] Jörg Tiedemann,et al. OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles , 2016, LREC.
[51] Prakhar Gupta,et al. Learning Word Vectors for 157 Languages , 2018, LREC.
[52] Meng Sun,et al. Baidu Neural Machine Translation Systems for WMT19 , 2019, WMT.
[53] Graham Neubig,et al. When and Why Are Pre-Trained Word Embeddings Useful for Neural Machine Translation? , 2018, NAACL.
[54] Philip Resnik,et al. Mining the Web for Bilingual Text , 1999, ACL.
[55] Maarten de Rijke,et al. Finding Similar Sentences across Multiple Languages in Wikipedia , 2006.
[56] Pablo Gamallo Otero,et al. Wikipedia as Multilingual Source of Comparable Corpora , 2011.
[57] Houda Bouamor,et al. H2@BUCC18: Parallel Sentence Extraction from Comparable Corpora Using Multilingual Sentence Embeddings , 2018, BUCC@LREC.