Visual Text Features for Image Matching

We present a new class of visual text features derived from text in camera-phone images. A robust text detection algorithm locates individual text lines and feeds them to a recognition engine. From the recognized characters, we generate visual text features in a way that resembles conventional image features: for each feature we compute a location, scale, and orientation, together with a descriptor that encodes character and word information. We apply visual text features to image matching and, to reject false matches, we develop a word-distance matching method. Our experiments with images that contain text show that the new image matching pipeline based on visual text features performs on par with or better than a conventional pipeline based on image features, while requiring less than 10 bits per feature. This is 4.5× smaller than state-of-the-art visual feature descriptors.
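As a rough illustration of the feature layout described above, the sketch below builds one feature per recognized character. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `CharBox`, `VisualTextFeature`, `features_from_line`, and `count_matches` are hypothetical, the choice of box center for location and box height for scale is assumed, and the simple descriptor-equality count at the end is a stand-in that does not reproduce the word-distance matching method.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CharBox:
    """One recognized character with its bounding box (hypothetical OCR output)."""
    char: str
    word: str   # the word this character belongs to
    x: float    # left edge
    y: float    # top edge
    w: float    # box width
    h: float    # box height


@dataclass(frozen=True)
class VisualTextFeature:
    x: float            # location (assumed: box center)
    y: float
    scale: float        # assumed: character height
    orientation: float  # assumed: angle of the text line, in radians
    descriptor: tuple   # (character, word)


def features_from_line(chars, line_angle_rad=0.0):
    """Turn the recognized characters of one text line into features."""
    return [
        VisualTextFeature(
            x=c.x + c.w / 2.0,
            y=c.y + c.h / 2.0,
            scale=c.h,
            orientation=line_angle_rad,
            descriptor=(c.char, c.word),
        )
        for c in chars
    ]


def count_matches(feats_a, feats_b):
    """Stand-in matcher: count descriptors shared by both feature sets."""
    a = Counter(f.descriptor for f in feats_a)
    b = Counter(f.descriptor for f in feats_b)
    return sum((a & b).values())


line = [CharBox("H", "HI", 0, 0, 10, 20), CharBox("I", "HI", 12, 0, 4, 20)]
feats = features_from_line(line)
```

Because the descriptor here is a symbolic (character, word) pair rather than a gradient histogram, it can in principle be stored very compactly, which is consistent with the low per-feature bit budget reported in the abstract.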
