Building a Japanese Typo Dataset from Wikipedia's Revision History
[1] Idan Szpektor, et al. DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion, 2019, NAACL.
[2] Manabu Okumura, et al. A Simple Approach to Unknown Word Processing in Japanese Morphological Analysis, 2013, IJCNLP.
[3] Yonatan Belinkov, et al. Synthetic and Natural Noise Both Break Neural Machine Translation, 2017, ICLR.
[4] Torsten Zesch, et al. Measuring Contextual Fitness Using Error Contexts Extracted from the Wikipedia Revision History, 2012, EACL.
[5] Chris Callison-Burch, et al. Optimizing Statistical Machine Translation for Text Simplification, 2016, TACL.
[6] Shuly Wintner, et al. Native Language Identification with User Generated Content, 2018, EMNLP.
[7] Yuji Matsumoto, et al. Improving Neural Text Normalization with Data Augmentation at Character- and Morphological Levels, 2017, IJCNLP.
[8] Yuji Matsumoto, et al. Mining Revision Log of Language Learning SNS for Automated Japanese Error Correction of Second Language Learners, 2011, IJCNLP.
[9] Yukino Baba, et al. How Are Spelling Errors Generated and Corrected? A Study of Corrected and Uncorrected Spelling Errors Using Keystroke Logs, 2012, ACL.
[10] Brendan T. O'Connor, et al. Demographic Dialectal Variation in Social Media: A Case Study of African-American English, 2016, EMNLP.
[11] Guillaume Wisniewski, et al. Mining Naturally-occurring Corrections and Paraphrases from Wikipedia’s Revision History, 2010, LREC.
[12] Masato Hagiwara, et al. GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors, 2019, LREC.
[13] Alexander M. Rush, et al. OpenNMT: Open-Source Toolkit for Neural Machine Translation, 2017, ACL.
[14] Yuji Matsumoto, et al. Applying Conditional Random Fields to Japanese Morphological Analysis, 2004, EMNLP.
[15] Yuen-Hsien Tseng, et al. Building a TOCFL Learner Corpus for Chinese Grammatical Error Diagnosis, 2018, LREC.
[16] Nicholas Diakopoulos, et al. Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs, 2011, EMNLP.
[17] Daisuke Kawahara, et al. Juman++: A Morphological Analysis Toolkit for Scriptio Continua, 2018, EMNLP.
[18] Kevin Duh, et al. Robsut Wrod Reocginiton via Semi-Character Recurrent Neural Network, 2016, AAAI.