Document Image Dewarping Contest

Dewarping of documents captured with hand-held cameras in uncontrolled environments has attracted considerable interest in the scientific community over the last few years, and many approaches have been proposed. However, there has so far been no comparative evaluation of different dewarping techniques. To fill this gap, we organized a page dewarping contest in conjunction with CBDAR 2007. We created a dataset of 102 documents captured with a hand-held camera and made it freely available online. We also prepared text-line, text-zone, and ASCII text ground truth for the documents in this dataset. Three groups participated in the contest with their methods. In this paper we present an overview of the participants' approaches, the evaluation measure, and the dataset used in the contest, and we report the performance of all participating methods. The evaluation shows that no participating method performed statistically significantly better than any other.
