Document image dewarping using robust estimation of curled text lines

Digital cameras have become almost ubiquitous, and their use for fast, casual capture of natural images is unchallenged. For capturing documents, however, they have not yet caught up with flatbed scanners, mainly because camera images tend to suffer from perspective distortion, which limits their further use for archiving or OCR. For images of non-planar paper surfaces such as books, page curl causes additional distortion, which poses an even greater problem because it is nonlinear. This paper presents a new algorithm for removing both perspective and page curl distortion. It requires only a single camera image as input and relies on a priori layout information instead of additional hardware. It is therefore much more user-friendly than most previous approaches and allows flexible, ad hoc document capture. Results show that the algorithm produces visually pleasing output and increases OCR accuracy, giving it the potential to become a general-purpose preprocessing tool for camera-based document capture.
