Learning to Weight Color and Depth for RGB-D Visual Search

Both color and depth information can be exploited to search RGB-D imagery by content. Previous works dealing with global descriptors for RGB-D images advocate decision-level fusion, whereby independently computed color and depth representations are juxtaposed to perform similarity search. In contrast, this paper proposes a learning-to-rank paradigm that weights the two information channels according to the specific traits of the task and data at hand, thereby accommodating the potential diversity across applications. In particular, we propose a novel method, referred to as kNN-rank, which learns the regularities among the outputs yielded by similarity-based queries. A further contribution of this paper is the HyperRGBD framework, a set of tools conceived to enable seamless aggregation of existing RGB-D datasets in order to obtain new data featuring desired characteristics and cardinality.
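
To make the idea of weighting the two channels concrete, the following is a minimal sketch and not the kNN-rank method itself, whose details the abstract does not give. It assumes that each query already has precomputed color and depth distances to every database item, fuses them with a single convex weight, and picks that weight by a plain grid search on top-1 retrieval accuracy over labeled training queries. All names (`fuse`, `rank`, `top1_accuracy`, `learn_weight`, `QUERIES`) and the toy numbers are hypothetical.

```python
from typing import List, Sequence, Tuple

# One training query: per-item (color_dist, depth_dist), item labels, query label.
Query = Tuple[Sequence[Tuple[float, float]], Sequence[str], str]


def fuse(d_color: float, d_depth: float, w: float) -> float:
    """Convex combination of the two distances; w is the color weight in [0, 1]."""
    return w * d_color + (1.0 - w) * d_depth


def rank(dists: Sequence[Tuple[float, float]], w: float) -> List[int]:
    """Database indices sorted by increasing fused distance."""
    return sorted(range(len(dists)), key=lambda i: fuse(dists[i][0], dists[i][1], w))


def top1_accuracy(queries: Sequence[Query], w: float) -> float:
    """Fraction of queries whose nearest fused neighbor has the query's label."""
    hits = 0
    for dists, db_labels, q_label in queries:
        best = rank(dists, w)[0]
        hits += int(db_labels[best] == q_label)
    return hits / len(queries)


def learn_weight(queries: Sequence[Query], steps: int = 11) -> float:
    """Grid search over w in [0, 1]; an assumed stand-in for a learned ranker."""
    candidates = [k / (steps - 1) for k in range(steps)]
    return max(candidates, key=lambda w: top1_accuracy(queries, w))


# Toy data (made up): depth alone solves the first query, color helps the second.
QUERIES: List[Query] = [
    ([(0.9, 0.1), (0.2, 0.8)], ["mug", "bowl"], "mug"),
    ([(0.8, 0.4), (0.1, 0.5)], ["mug", "bowl"], "bowl"),
]

if __name__ == "__main__":
    w = learn_weight(QUERIES)
    print("learned color weight:", w)
    print("top-1 accuracy:", top1_accuracy(QUERIES, w))
```

On this toy data neither channel alone retrieves both queries correctly, while an intermediate weight does, which is the kind of task- and data-dependent balance the abstract argues for. A learning-to-rank model would replace the grid search with a ranker trained on the query outputs.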
