Jamo Pair Encoding: Subcharacter Representation-based Extreme Korean Vocabulary Compression for Efficient Subword Tokenization
[1] Oriol Vinyals, et al. Multilingual Language Processing From Bytes, 2015, NAACL.
[2] Chao Liu, et al. Radical Embedding: Delving Deeper to Chinese Radicals, 2015, ACL.
[3] Taku Kudo, et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing, 2018, EMNLP.
[4] Rui Li, et al. Multi-Granularity Chinese Word Embedding, 2016, EMNLP.
[5] Alice H. Oh, et al. Subword-level Word Vector Representations for Korean, 2018, ACL.
[6] Nam Soo Kim, et al. Investigating an Effective Character-level Embedding in Korean Sentence Classification, 2019, ArXiv.
[7] Percy Liang, et al. Know What You Don’t Know: Unanswerable Questions for SQuAD, 2018, ACL.
[8] Sebastian Ruder, et al. Universal Language Model Fine-tuning for Text Classification, 2018, ACL.
[9] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[10] Karl Stratos. A Sub-Character Architecture for Korean Language Processing, 2017, EMNLP.
[11] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[12] Rémi Louf, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, ArXiv.
[13] Mirella Lapata, et al. 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, 2016, ACL 2016.
[14] Guillaume Lample, et al. Cross-lingual Language Model Pretraining, 2019, NeurIPS.
[15] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[16] Nan Yang, et al. Radical-Enhanced Chinese Character Embedding, 2014, ICONIP.
[17] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[18] Alistair A. Young, et al. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, MICCAI 2017.
[19] Lei Wu, et al. Dual Long Short-Term Memory Networks for Sub-Character Representation Learning, 2017, ArXiv.
[20] Jaime G. Carbonell, et al. Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations, 2018, EMNLP.
[21] Masao Utiyama, et al. Simplified Abugidas, 2018, ACL.