Gestural Interaction for an Automatic Document Capture System

The number of printed documents in use today is still very large despite the increased use of digital formats. To bridge the gap between analog paper and digital media, paper documents need to be captured. We present a prototype that allows for cost-effective, fast, and robust document capture using a standard consumer camera. The system continuously monitors the user's physical desktop. Whenever a document is detected, the system acquires its content in one of two ways: either the entire document is captured, or a region of interest is extracted, which the user can specify simply by pointing at it. In both modes, a high-resolution image is taken and the information it contains is digitized. The main challenges in designing and implementing such a capture system are real-time performance, accurate detection of documents, reliable detection of the user's hand, and robustness against perturbations such as lighting changes and shadows. This paper presents approaches that address these challenges and discusses their integration into a robust document capture system with gestural interaction.
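
The abstract does not say how the desktop is monitored, so the following is only an illustrative sketch of one common approach: an exponential running-average background model with a difference threshold. The function names (`update_background`, `foreground_mask`, `document_present`) and the parameters `alpha`, `threshold`, and `min_fraction` are hypothetical and are not taken from the paper. The sketch shows how a slowly adapting background can absorb gradual lighting changes while a newly placed document still registers as foreground.

```python
from typing import List

# Assumption: a frame is a grayscale image given as rows of intensities in [0, 255].
Frame = List[List[float]]
Mask = List[List[bool]]


def update_background(background: Frame, frame: Frame, alpha: float = 0.05) -> Frame:
    """Blend the new frame into the background model (exponential running average).

    A small alpha makes the model adapt slowly, so gradual lighting drift is
    absorbed into the background over many frames.
    """
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def foreground_mask(background: Frame, frame: Frame, threshold: float = 30.0) -> Mask:
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def document_present(mask: Mask, min_fraction: float = 0.2) -> bool:
    """Report a candidate document when enough of the frame is foreground."""
    total = sum(len(row) for row in mask)
    changed = sum(sum(row) for row in mask)  # True counts as 1
    return total > 0 and changed / total >= min_fraction


if __name__ == "__main__":
    empty_desk = [[0.0] * 4 for _ in range(4)]
    # A bright sheet covering the top half of the (tiny, hypothetical) frame.
    with_sheet = [[200.0] * 4 for _ in range(2)] + [[0.0] * 4 for _ in range(2)]
    print(document_present(foreground_mask(empty_desk, with_sheet)))
```

A real system would need considerably more than this, for example shadow handling and a separate hand detector for the pointing gesture; the sketch only covers the change-detection step under the stated assumptions.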
