Unsupervised profiling of OCRed historical documents
[1] Hildelies Balk, et al. IMPACT: centre of competence in text digitisation, 2011, HIP '11.
[2] George Nagy, et al. Optical character recognition: an illustrated guide to the frontier, 1999, Electronic Imaging.
[3] Alicia Fornés, et al. Transcription alignment of Latin manuscripts using hidden Markov models, 2011, HIP '11.
[4] Lon-Mu Liu, et al. Adaptive post-processing of OCR text via knowledge acquisition, 1991, CSC '91.
[5] Kenneth Ward Church, et al. Probability scoring for spelling correction, 1991.
[6] Xiang Tong, et al. A Statistical Approach to Automatic OCR Error Correction in Context, 1996, VLC@COLING.
[7] Norbert Fuhr, et al. Generating Search Term Variants for Text Collections with Historic Spellings, 2006, ECIR.
[8] Ray Smith. Limits on the Application of Frequency-Based Language Models to OCR, 2011, 2011 International Conference on Document Analysis and Recognition.
[9] Apostolos Antonacopoulos, et al. Grid-based modelling and correction of arbitrarily warped historical document images for large-scale digitisation, 2011, HIP '11.
[10] Rong Jin, et al. Information retrieval for OCR documents: a content-based probabilistic correction model, 2003, IS&T/SPIE Electronic Imaging.
[11] Achim Weigel, et al. Lexical postprocessing by heuristic search and automatic determination of the edit costs, 1995, Proceedings of 3rd International Conference on Document Analysis and Recognition.
[12] Klaus U. Schulz, et al. Deriving Symbol Dependent Edit Weights for Text Correction: The Use of Error Dictionaries, 2007.
[13] Peter N. Yianilos, et al. Learning String-Edit Distance, 1996, IEEE Trans. Pattern Anal. Mach. Intell.
[14] William J. Byrne, et al. A Generative Probabilistic OCR Model for NLP Applications, 2003, NAACL.
[15] Bryan Jurish, et al. More than Words: Using Token Context to Improve Canonicalization of Historical German, 2010, J. Lang. Technol. Comput. Linguistics.
[16] Klaus U. Schulz, et al. On lexical resources for digitization of historical documents, 2009, DocEng '09.
[17] Michael J. Fischer, et al. The String-to-String Correction Problem, 1974, JACM.
[18] Klaus U. Schulz, et al. Towards information retrieval on historical document collections: the role of matching procedures and special lexica, 2010, International Journal on Document Analysis and Recognition (IJDAR).
[19] Ulrich Reffle. Efficiently generating correction suggestions for garbled tokens of historical language, 2011, Nat. Lang. Eng.
[20] Sarah M. Greene, et al. More than words: patients' views on apology and disclosure when things go wrong in cancer care, 2013, Patient education and counseling.
[21] Klaus U. Schulz, et al. Fast string correction with Levenshtein automata, 2002, International Journal on Document Analysis and Recognition.
[22] Eric Brill, et al. An Improved Error Model for Noisy Channel Spelling Correction, 2000, ACL.
[23] Ashok C. Popat. A panlingual anomalous text detector, 2009, DocEng '09.
[24] C. E. Shannon, et al. A mathematical theory of communication, 1948, MOCO.
[25] Kazem Taghva, et al. Information access in the presence of OCR errors, 2004, HDP '04.
[26] Thomas L. Packer. Performing information extraction to improve OCR error detection in semi-structured historical documents, 2011, HIP '11.