Interactive visual object search through mutual information maximization

Searching for small objects (e.g., logos) in images is a critical yet challenging problem. It becomes harder still when target objects differ significantly from the query object because of changes in scale, viewpoint, or style, not to mention partial occlusion or cluttered backgrounds. To retrieve and accurately locate small objects in images, we formulate object search as the problem of finding the subimages with the largest mutual information with respect to the query object. Each image is characterized by a collection of local features. Instead of matching against the query object alone, we propose a discriminative matching scheme that uses both positive and negative queries to obtain the mutual information score. The user can verify the retrieved subimages and thereby improve the search results incrementally. Our experiments on a challenging logo database of 10,000 images highlight the effectiveness of this approach.
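The formulation described above can be illustrated with a minimal sketch. The abstract does not give the scoring formula or the search procedure, so everything here is an assumption: the per-feature vote is a simple nearest-neighbor distance difference between the negative and positive query sets, the subimage score is the sum of the votes it contains, and the best subimage is found by an exhaustive maximum-sum rectangle search over a small grid. The names `nn_sq_dist`, `point_score`, `best_subimage`, and the `bandwidth` parameter are hypothetical.

```python
import math


def nn_sq_dist(feature, pool):
    """Squared Euclidean distance from `feature` to its nearest neighbor in `pool`."""
    return min(sum((a - b) ** 2 for a, b in zip(feature, q)) for q in pool)


def point_score(feature, positives, negatives, bandwidth=1.0):
    """Assumed per-feature vote (not taken from the paper): positive when the
    feature is closer to the positive queries than to the negative ones."""
    d_pos = nn_sq_dist(feature, positives)
    d_neg = nn_sq_dist(feature, negatives)
    return (d_neg - d_pos) / (2.0 * bandwidth ** 2)


def best_subimage(scored_points, width, height):
    """Return (best_sum, (x0, y0, x1, y1)) for the axis-aligned rectangle whose
    contained votes sum highest. `scored_points` holds (x, y, score) triples on
    an integer grid. Exhaustive row-pair scan with a 1D Kadane pass per pair;
    a stand-in for whatever search the authors actually use."""
    grid = [[0.0] * width for _ in range(height)]
    for x, y, s in scored_points:
        grid[y][x] += s

    best = (-math.inf, None)
    for top in range(height):
        col = [0.0] * width  # column sums for rows top..bottom
        for bottom in range(top, height):
            for x in range(width):
                col[x] += grid[bottom][x]
            run, start = 0.0, 0
            for x in range(width):
                if run <= 0.0:
                    run, start = col[x], x  # restart the running interval
                else:
                    run += col[x]
                if run > best[0]:
                    best = (run, (start, top, x, bottom))
    return best


# Toy usage: three features that vote for the object, two that vote against.
votes = [(2, 2, 1.0), (3, 2, 1.5), (2, 3, 0.5), (0, 0, -2.0), (5, 5, -1.0)]
score, box = best_subimage(votes, width=6, height=6)
```

In this sketch the negative queries matter because a feature that resembles background clutter receives a negative vote and pushes the best rectangle away from it. User feedback on retrieved subimages could be folded in by appending verified matches to `positives` and rejected ones to `negatives` before rescoring.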
