Revealing What to Extract from Where, for Object-Centric Content Based Image Retrieval (CBIR)

Content Based Image Retrieval (CBIR) techniques retrieve similar digital images from a large database. Because the user often gives no indication of the region of interest in a query image, most CBIR methods rely on a representation of the global content of the image. The desired content, however, is often localized rather than holistic (e.g. a car appearing salient in a street scene), which calls for an object-centric CBIR. We propose a biologically inspired framework, WOW ("What" Object is "Where"), for this purpose. The design of the WOW framework is motivated by the cognitive model of human visual perception and by feature integration theory (FIT). The key contributions of the proposed approach are: (i) a feedback mechanism between the Recognition ("What") and Localization ("Where") modules (both supervised), yielding a cohesive decision based on mutual consensus; (ii) a hierarchy of visual features (based on FIT) for an efficient recognition task. Integrating information from the two channels ("What" and "Where") in an iterative feedback mechanism helps to filter erroneous content from the outputs of the individual modules. Finally, using a similarity criterion based on HOG features (spatially localized by WOW) for matching, our system retrieves a set of rank-ordered samples from the gallery. Experiments on various real-life datasets (including PASCAL) show the superior performance of the proposed method.
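The final matching step can be illustrated with a minimal sketch. It is not the authors' implementation: the descriptor below is a simplified per-cell orientation histogram standing in for full HOG, the object boxes are assumed to be supplied by a localization stage such as WOW, and cosine similarity is an assumed choice of similarity criterion (the abstract does not name one). The function names `orientation_histogram`, `crop`, and `rank_gallery` are hypothetical.

```python
import numpy as np

def orientation_histogram(patch, cells=4, bins=9):
    """Simplified HOG-like descriptor (assumption, not full HOG):
    per-cell histograms of unsigned gradient orientation, weighted by magnitude."""
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)                 # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)     # unsigned orientation in [0, pi)
    bin_idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    h, w = patch.shape
    ys = np.linspace(0, h, cells + 1).astype(int)
    xs = np.linspace(0, w, cells + 1).astype(int)
    feats = []
    for i in range(cells):
        for j in range(cells):
            m = mag[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].ravel()
            b = bin_idx[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].ravel()
            feats.append(np.bincount(b, weights=m, minlength=bins))
    v = np.concatenate(feats)
    return v / (np.linalg.norm(v) + 1e-12)      # L2-normalize the whole descriptor

def crop(image, box):
    """Cut out the localized object; box = (x0, y0, x1, y1), assumed given."""
    x0, y0, x1, y1 = box
    return image[y0:y1, x0:x1]

def rank_gallery(query, query_box, gallery):
    """gallery is a list of (image, box) pairs.
    Returns gallery indices and scores, sorted by descending cosine similarity."""
    q = orientation_histogram(crop(query, query_box))
    scores = np.array([q @ orientation_histogram(crop(img, box))
                       for img, box in gallery])
    order = np.argsort(-scores, kind="stable")
    return order.tolist(), scores[order].tolist()
```

Because the descriptor is computed only inside the box, background clutter outside the localized object does not affect the ranking, which is the point of an object-centric criterion.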
