PerfectDoc: a ground truthing environment for complex documents
In this paper, we present PerfectDoc, a ground truthing and document correction tool. The tool provides the post-processing correction capabilities that are required after complex document analysis and understanding tasks. It is comprehensive (it integrates the most common correction tasks), easy to use (corrections take minimal clicks), and configurable (it can be used for different types of documents), and it provides separate correction views. We used the tool to correct the output of a document understanding system that extracts articles from an 80-year archive of the weekly magazine Time.