OpusFilter: A Configurable Parallel Corpus Filtering Toolbox
[1] Philipp Koehn, et al. Moses: Open Source Toolkit for Statistical Machine Translation, 2007, ACL.
[2] Philipp Koehn, et al. Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions, 2019, WMT.
[3] Huda Khayrallah, et al. On the Impact of Various Types of Noise on Neural Machine Translation, 2018, NMT@ACL.
[4] Jörg Tiedemann, et al. Efficient Word Alignment with Markov Chain Monte Carlo, 2016, Prague Bull. Math. Linguistics.
[5] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[6] Alexander M. Rush, et al. OpenNMT: Open-Source Toolkit for Neural Machine Translation, 2017, ACL.
[7] Mikel L. Forcada, et al. ParaCrawl: Web-scale parallel corpora for the languages of the EU, 2019, MTSummit.
[8] Jörg Tiedemann, et al. Parallel Data, Tools and Interfaces in OPUS, 2012, LREC.
[9] Huda Khayrallah, et al. Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering, 2018, WMT.
[10] Teemu Hirsimäki, et al. On Growing and Pruning Kneser–Ney Smoothed N-Gram Models, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Philipp Koehn, et al. Zipporah: a Fast and Scalable Data Cleaning System for Noisy Web-Crawled Parallel Corpora, 2017, EMNLP.
[12] Jörg Tiedemann, et al. The University of Helsinki Submission to the WMT19 Parallel Corpus Filtering Task, 2019, WMT.
[13] Keith Stevens, et al. Effective Parallel Corpus Mining using Bilingual Sentence Embeddings, 2018, WMT.
[14] Víctor M. Sánchez-Cartagena, et al. Prompsit’s submission to WMT 2018 Parallel Corpus Filtering shared task, 2018, WMT.