Augmenting text document by on-line learning of local arrangement of keypoints

We propose a technique for tracking text documents over a large range of viewpoints. Since the popular SIFT and SURF descriptors typically fail on such documents, our method relies instead on the local arrangement of keypoints. We extend Locally Likely Arrangement Hashing (LLAH), which is limited to fronto-parallel images, to a large range of viewpoints by learning how keypoint patterns behave when the camera viewpoint changes. Our method starts tracking a document from a nearly frontal view. As the document then moves relative to the camera, new configurations of keypoints appear. The database is incrementally updated to reflect these new observations, allowing the system to detect the document under the new viewpoint. We demonstrate the performance and robustness of our method by comparing it with the original LLAH.
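The incremental database update described above can be illustrated with a minimal sketch. It is not the authors' implementation: the descriptor is a simplified stand-in for LLAH (quantized triangle-area ratios over the nearest neighbours of a keypoint), and the names `descriptor`, `ArrangementDatabase`, and `register_view`, the quantization rule, and the sample coordinates are all assumptions made for illustration. It also assumes that tracking already supplies the identity of each keypoint in the new view.

```python
from collections import defaultdict
from itertools import combinations
import math


def tri_area(a, b, c):
    """Area of triangle abc (non-negative)."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def descriptor(center, points, n=5, levels=8):
    """Quantized area ratios built from the n nearest neighbours of `center`.

    Simplified stand-in for an LLAH descriptor (assumption): a full version
    would also handle the cyclic starting point of the neighbour ordering.
    """
    nbrs = sorted((p for p in points if p != center),
                  key=lambda p: math.dist(center, p))[:n]
    # Order the neighbours by angle around the centre keypoint.
    nbrs.sort(key=lambda p: math.atan2(p[1] - center[1], p[0] - center[0]))
    codes = []
    for a, b, c, d in combinations(nbrs, 4):
        denom = tri_area(a, b, c)
        if denom == 0:  # degenerate (collinear) triangle
            codes.append(levels - 1)
        else:
            ratio = tri_area(a, c, d) / denom
            codes.append(min(levels - 1, int(ratio * 2)))  # hypothetical quantization
    return tuple(codes)


class ArrangementDatabase:
    """Hash table from descriptor to the ids of keypoints that produced it."""

    def __init__(self):
        self.table = defaultdict(set)

    def add(self, keypoint_id, desc):
        self.table[desc].add(keypoint_id)

    def lookup(self, desc):
        return self.table.get(desc, set())


def register_view(db, points, n=5):
    """Store the descriptor of every keypoint; list index i is the keypoint id."""
    for kp_id, p in enumerate(points):
        db.add(kp_id, descriptor(p, points, n))


# Nearly frontal view, then a tracked view under a new viewpoint.
# Both coordinate lists are made-up sample data with matching ids.
frontal = [(0, 0), (4, 1), (1, 5), (6, 6), (9, 2), (3, 9), (8, 8)]
tilted = [(0, 0), (4, 1), (1, 3), (5, 4), (8, 2), (2, 5), (6, 5)]

db = ArrangementDatabase()
register_view(db, frontal)   # initial database
register_view(db, tilted)    # on-line update with the newly observed arrangements

query = descriptor(tilted[3], tilted)
print(3 in db.lookup(query))  # True: keypoint 3 is now retrievable from the tilted view
```

The design point is that each keypoint id accumulates several descriptors, one per observed viewpoint, so a later lookup can succeed without the descriptor itself being invariant to perspective.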
