RGB-D Visual Search with Compact Binary Codes

As depth sensing is likely to be integrated into mobile devices soon, we investigate merging appearance and shape information for mobile visual search. Accordingly, we propose an RGB-D search engine architecture that can attain high recognition rates with notably moderate bandwidth requirements. Our experiments include a comparison to the CDVS (Compact Descriptors for Visual Search) pipeline, a candidate to become part of the MPEG-7 standard, and help clarify the merits and limitations of jointly deploying depth and color in mobile visual search.
