Overview of the 2017 ALTA Shared Task: Correcting OCR Errors
This paper presents an overview of the 8th ALTA shared task that ran in 2017. The task was to correct OCR errors from scans of newspapers stored in the Trove database maintained by the National Library of Australia. We introduce the task, describe the data and present the results of the participating teams.