A Large-Scale Comparison of Historical Text Normalization Systems