Massive vs. Curated Embeddings for Low-Resourced Languages: the Case of Yorùbá and Twi
Jesujoba O. Alabi | Kwabena Amponsah-Kaakyire | David Ifeoluwa Adelani | Cristina España-Bonet