Exploring Segmentation Approaches for Neural Machine Translation of Code-Switched Egyptian Arabic-English Text
[1] Ngoc Thang Vu,et al. ArzEn-ST: A Three-way Speech Translation Corpus for Code-Switched Egyptian Arabic-English , 2022, WANLP.
[2] Ngoc Thang Vu,et al. Investigating Lexical Replacements for Arabic-English Code-Switched Data Augmentation , 2022, LORESMT.
[3] Ngoc Thang Vu,et al. BPE vs. Morphological Segmentation: A Case Study on Machine Translation of Four Polysynthetic Languages , 2022, FINDINGS.
[4] Mona T. Diab,et al. CALCS 2021 Shared Task: Machine Translation for Code-Switched Data , 2022, ArXiv.
[5] Ngoc Thang Vu,et al. Investigations on Speech Recognition Systems for Low-Resource Dialectal Arabic-English Code-Switching Speech , 2021, Comput. Speech Lang..
[6] Marcin Junczys-Dowmunt,et al. To Ship or Not to Ship: An Extensive Evaluation of Automatic Metrics for Machine Translation , 2021, WMT.
[7] Preethi Jyothi,et al. From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text , 2021, ACL.
[8] François Yvon,et al. Can You Traducir This? Machine Translation for Code-Switched Input , 2021, CALCS.
[9] Jonne Saleva,et al. The Effectiveness of Morphology-aware Segmentation in Low-Resource Neural Machine Translation , 2021, EACL.
[10] Kyunghyun Cho,et al. Neural machine translation with a polysynthetic low resource language , 2020, Machine Translation.
[11] Smaranda Muresan,et al. MorphAGram, Evaluation and Framework for Unsupervised Morphological Segmentation , 2020, LREC.
[12] Alexander Erdmann,et al. CAMeL Tools: An Open Source Python Toolkit for Arabic Natural Language Processing , 2020, LREC.
[13] Ngoc Thang Vu,et al. ArzEn: A Speech Corpus for Code-switched Egyptian Arabic-English , 2020, LREC.
[14] Ngoc Thang Vu,et al. Cairo Student Code-Switch (CSCS) Corpus: An Annotated Egyptian Arabic-English Corpus , 2020, LREC.
[15] Mayank Singh,et al. PHINC: A Parallel Hinglish Social Media Code-Mixed Corpus for Machine Translation , 2020, WNUT.
[16] Mikko Kurimo,et al. Morfessor EM+Prune: Improved Subword Segmentation with Expectation Maximization and Pruning , 2020, LREC.
[17] Yating Yang,et al. Morphological Word Segmentation on Agglutinative Languages for Neural Machine Translation , 2020, ArXiv.
[18] Sivaji Bandyopadhyay,et al. Code-Mixed to Monolingual Translation Framework , 2019, FIRE.
[19] Ahmed Y. Tawfik,et al. Morphology-aware Word-Segmentation in Dialectal Arabic Adaptation of Neural Machine Translation , 2019, WANLP@ACL 2019.
[20] Nizar Habash,et al. The Impact of Preprocessing on Arabic-English Statistical and Neural Machine Translation , 2019, MTSummit.
[21] Kamel Smaïli,et al. Machine Translation on a Parallel Code-Switched Corpus , 2019, Canadian AI.
[22] Yue Zhang,et al. Code-Switching for Enhancing NMT with Pre-Specified Translation , 2019, NAACL.
[23] Katharina Kann,et al. Subword-Level Language Identification for Intra-Word Code-Switching , 2019, NAACL.
[24] Myle Ott,et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling , 2019, NAACL.
[25] Philipp Koehn,et al. Two New Evaluation Datasets for Low-Resource Machine Translation: Nepali-English and Sinhala-English , 2019, ArXiv.
[26] Taku Kudo,et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing , 2018, EMNLP.
[27] Kemal Oflazer,et al. The MADAR Arabic Dialect Corpus and Lexicon , 2018, LREC.
[28] Slim Abdennadher,et al. Collection and Analysis of Code-switch Egyptian Arabic-English Speech Corpus , 2018, LREC.
[29] Matt Post,et al. A Call for Clarity in Reporting BLEU Scores , 2018, WMT.
[30] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[31] Marcello Federico,et al. Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English , 2017, Prague Bull. Math. Linguistics.
[32] Nizar Habash,et al. Optimizing Tokenization Choice for Machine Translation across Multiple Target Languages , 2017, Prague Bull. Math. Linguistics.
[33] Nizar Habash,et al. Universal Dependencies for Arabic , 2017, WANLP@EACL.
[34] Ngoc Thang Vu,et al. Challenges of Computational Processing of Code-Switching , 2016, CodeSwitch@EMNLP.
[35] Abdulmohsen Al-Thubaity,et al. Effect of word segmentation on Arabic text classification , 2015, 2015 International Conference on Asian Language Processing (IALP).
[36] Maja Popovic,et al. chrF: character n-gram F-score for automatic MT evaluation , 2015, WMT@EMNLP.
[37] Alexandra Birch,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[38] Mikko Kurimo,et al. Morfessor FlatCat: An HMM-Based Method for Unsupervised and Semi-Supervised Learning of Morphology , 2014, COLING.
[39] Nizar Habash,et al. MADAMIRA: A Fast, Comprehensive Tool for Morphological Analysis and Disambiguation of Arabic , 2014, LREC.
[40] Mikko Kurimo,et al. Morfessor 2.0: Toolkit for statistical morphological segmentation , 2014, EACL.
[41] Nizar Habash,et al. Introduction to Arabic Natural Language Processing , 2010, Introduction to Arabic Natural Language Processing.
[42] Christian Monson,et al. EMMA: A novel Evaluation Metric for Morphological Analysis , 2010, COLING.
[43] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[44] Philip Gage,et al. A new algorithm for data compression , 1994 .
[45] L. Baum,et al. Statistical Inference for Probabilistic Functions of Finite State Markov Chains , 1966 .
[46] Jamila El Gizuli,et al. Camel Treebank: An Open Multi-genre Arabic Dependency Treebank , 2022, International Conference on Language Resources and Evaluation.
[47] Antonio Toral,et al. Machine Translation for English–Inuktitut with Segmentation, Data Acquisition and Pre-Training , 2020, WMT.
[48] Manish Shrivastava,et al. Enabling Code-Mixed Translation: Parallel Corpus Creation and MT Augmentation Approach , 2018 .
[49] Pushpak Bhattacharyya,et al. Meaningless yet meaningful: Morphology grounded subword-level NMT , 2018 .
[50] Tat-siong Benny Liew,et al. Colonialism and the Bible : Contemporary Reflections from the Global South , 2018 .
[51] Maja Popovic,et al. chrF++: words helping character n-grams , 2017, WMT.
[52] Nizar Habash,et al. Morphological Analysis and Disambiguation for Dialectal Arabic , 2013, NAACL.
[53] Mikko Kurimo,et al. Morfessor 2.0: Python Implementation and Extensions for Morfessor Baseline , 2013 .
[54] Nizar Habash,et al. Conventional Orthography for Dialectal Arabic , 2012, LREC.
[55] Nizar Habash,et al. 50th Annual Meeting of the Association for Computational Linguistics Proceedings of the Conference Volume 2: Short Papers , 2012 .
[56] Nizar Habash,et al. On Arabic Transliteration , 2007 .
[57] R. Sinha,et al. Machine Translation of Bi-lingual Hindi-English (Hinglish) Text , 2005, MTSUMMIT.
[58] M. Maamouri,et al. The Penn Arabic Treebank: Building a Large-Scale Annotated Arabic Corpus , 2004 .
[59] Treebank Penn,et al. Linguistic Data Consortium , 1999 .