Keep, Change or Delete? Setting up a Low Resource OCR Post-correction Framework for a Digitized Old Finnish Newspaper Collection
[1] Rose Holley, et al. How Good Can It Get? Analysing and Improving OCR Accuracy in Large Scale Historic Newspaper Digitisation Programs, 2009, D Lib Mag.
[2] Fachgebiet Wissensbasierte. Unsupervised Post-Correction of OCR Errors, 2010.
[3] Daniel P. Lopresti, et al. Optical character recognition errors and their effects on natural language processing, 2008, AND '08.
[4] Timo Honkela, et al. Analyzing and Improving the Quality of a Historical News Collection using Language Technology and Statistical Machine Learning Methods, 2014.
[5] Kazem Taghva, et al. Evaluation of model-based retrieval effectiveness with OCR text, 1996, TOIS.
[6] Simon Tanner, et al. Measuring Mass Text Digitization Quality and Usefulness, 2009.
[7] Karen Kukich, et al. Techniques for automatically correcting words in text, 1992, CSUR.
[8] Majlis Bremer-Laamanen. Connecting to the past: Newspaper digitization in the Nordic countries, 2006.
[9] Kimmo Kettunen, et al. How to do lexical quality estimation of a large OCRed historical Finnish newspaper collection with scarce resources, 2016, Digital Studies/Le champ numérique.
[10] Simon Tanner, et al. Measuring Mass Text Digitization Quality and Usefulness: Lessons Learned from Assessing the OCR Accuracy of the British Library's 19th Century Online Newspaper Archive, 2009, D Lib Mag.
[11] Otto Chrons, et al. Digitalkoot: Making Old Archives Accessible Using Crowdsourcing, 2011, Human Computation.
[12] Edwin Klijn. The Current State-of-art in Newspaper Digitization: A Market Perspective, 2008, D Lib Mag.
[13] Hartmut Walravens. A Nordic Digital Newspaper Library, 2006.
[14] Martin Volk, et al. Reducing OCR Errors in Gothic-Script Documents, 2011, ERCIM News.
[15] R. Segal, et al. A Market Perspective, 2003.