User-directed analysis of scanned images

Digital capture (scanning in all its forms, as well as digital photography and video recording) provides virtually free temporary memory of captured information, so users can "over-gather" information during capture and discard unwanted material later. For cameras and video recorders, such editing largely consists of discarding entire images or frames. For scanners (and high-resolution cameras and video), such editing benefits from a preview capability that provides quick and reliable user-interface tools for selecting, filtering and saving specific portions of the input. Appropriate preview user interface (UI) tools ease the accessing, editing and dispatch of captured information (text, tables, drawings, photos, etc.) to the desired destination (archive, application, webpage, etc.). In this paper, we present several means for the user-directed "rapid capture" of portions of a scanned image. Specifically, we review past, present and future preview-based UI tools that give the user an efficient and accurate means of capture. These tools are built on two techniques described herein: user-directed zoning analysis, known as "click and select", which incorporates a bottom-up zoning analysis engine; and statistics-based region classification, which allows rapid reconfiguration of region identification and clustering. We conclude with our view of the future of UI-directed capture.
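To make the "click and select" idea concrete, the following is a minimal sketch and not the zoning engine described in the paper. It assumes a grayscale preview image stored as a list of rows, a fixed ink threshold, and 8-connected region growing from the clicked pixel; the function name `click_and_select` and its parameters are hypothetical and chosen only for illustration. It builds a region bottom-up from pixels and returns the bounding box a preview UI might highlight.

```python
from collections import deque


def click_and_select(image, click, threshold=128):
    """Return (top, left, bottom, right) of the dark connected region
    under `click`, or None if the click lands on background.

    Assumptions (not from the paper): `image` is a list of rows of
    gray values in 0..255, and ink is any pixel below `threshold`.
    """
    rows, cols = len(image), len(image[0])
    r0, c0 = click
    if not (0 <= r0 < rows and 0 <= c0 < cols):
        raise ValueError("click outside image")
    if image[r0][c0] >= threshold:
        return None  # clicked on background, nothing to select

    seen = {(r0, c0)}
    queue = deque([(r0, c0)])
    top, left, bottom, right = r0, c0, r0, c0

    # Breadth-first region growing over 8-connected ink pixels.
    while queue:
        r, c = queue.popleft()
        top, bottom = min(top, r), max(bottom, r)
        left, right = min(left, c), max(right, c)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and (nr, nc) not in seen
                        and image[nr][nc] < threshold):
                    seen.add((nr, nc))
                    queue.append((nr, nc))
    return top, left, bottom, right


W, B = 255, 0  # white background, black ink
page = [
    [W, W, W, W, W, W],
    [W, B, B, W, W, W],
    [W, B, W, W, B, W],
    [W, W, W, W, B, W],
    [W, W, W, W, W, W],
]
box = click_and_select(page, (1, 1))
```

A practical tool would likely merge nearby components into words, lines and zones before returning a selection; this sketch stops at a single connected component to keep the bottom-up step visible.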
