Paper information - Object Reading: Text Recognition for Object Recognition

We propose to use text recognition to aid visual object class recognition. To this end, we first propose a new algorithm for text detection in natural images. The proposed text detection is based on saliency cues and a context fusion step. The algorithm does not need any parameter tuning and can deal with varying imaging conditions. We evaluate on three tasks: (1) scene text recognition, where we improve on the state of the art by 0.17 on the ICDAR 2003 dataset; (2) saliency-based object recognition, where we outperform other state-of-the-art saliency methods for object recognition on the PASCAL VOC 2011 dataset; and (3) object recognition with the aid of recognized text, where we are the first to report multi-modal results on the IMET set. Results show that text helps object class recognition if the text is not uniquely coupled to individual object instances.
