WangchanBERTa: Pretraining transformer-based Thai Language Models
Sarana Nutanong | Lalita Lowphansirikul | Charin Polpanumas | Nawat Jantrakulchai