Unsupervised Deep Learning of Compact Binary Descriptors

Binary descriptors have been widely used for efficient image matching and retrieval. However, most existing binary descriptors are designed with hand-craft sampling patterns or learned with label annotation provided by datasets. In this paper, we propose a new unsupervised deep learning approach, called DeepBit, to learn compact binary descriptor for efficient visual object matching. We enforce three criteria on binary descriptors which are learned at the top layer of the deep neural network: 1) minimal quantization loss, 2) evenly distributed codes and 3) transformation invariant bit. Then, we estimate the parameters of the network through the optimization of the proposed objectives with a back-propagation technique. Extensive experimental results on various visual recognition tasks demonstrate the effectiveness of the proposed approach. We further demonstrate our proposed approach can be realized on the simplified deep neural network, and enables efficient image matching and retrieval speed with very competitive accuracies.

[1]  Forrest N. Iandola,et al.  SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.

[2]  Alexandr Andoni,et al.  Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[3]  Jiwen Lu,et al.  PCANet: A Simple Deep Learning Baseline for Image Classification? , 2014, IEEE Transactions on Image Processing.

[4]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[5]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[6]  Nicu Sebe,et al.  A Survey on Learning to Hash , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[8]  Venkatesh Saligrama,et al.  Efficient Training of Very Deep Neural Networks for Supervised Hashing , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Shiliang Zhang,et al.  USB: Ultrashort Binary Descriptor for Fast Visual Matching and Retrieval , 2014, IEEE Transactions on Image Processing.

[10]  Andrew Zisserman,et al.  A Visual Vocabulary for Flower Classification , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[11]  Krystian Mikolajczyk,et al.  BOLD - Binary online learned descriptor for efficient image matching , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  R. Venkatesh Babu,et al.  Object level deep feature pooling for compact image representation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[13]  Stefan Carlsson,et al.  CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[14]  Pascal Fua,et al.  Receptive Fields Selection for Binary Feature Description , 2014, IEEE Transactions on Image Processing.

[15]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[16]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[17]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[18]  Shiguang Shan,et al.  Deep Supervised Hashing for Fast Image Retrieval , 2016, International Journal of Computer Vision.

[19]  Pascal Fua,et al.  LDAHash: Improved Matching with Smaller Descriptors , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Victor S. Lempitsky,et al.  Neural Codes for Image Retrieval , 2014, ECCV.

[21]  Qi Tian,et al.  Coupled Binary Embedding for Large-Scale Image Retrieval , 2014, IEEE Transactions on Image Processing.

[22]  Pierre Vandergheynst,et al.  FREAK: Fast Retina Keypoint , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[24]  Xin Yang,et al.  Local difference binary for ultrafast and distinctive feature description. , 2013, IEEE transactions on pattern analysis and machine intelligence.

[25]  Bin Fan,et al.  L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Tinne Tuytelaars,et al.  Mining Mid-level Features for Image Classification , 2014, International Journal of Computer Vision.

[27]  Chu-Song Chen,et al.  Supervised Learning of Semantics-Preserving Hash via Deep Convolutional Neural Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Cordelia Schmid,et al.  Indexing Based on Scale Invariant Interest Points , 2001, ICCV.

[29]  Ji Wan,et al.  Deep Learning for Content-Based Image Retrieval: A Comprehensive Study , 2014, ACM Multimedia.

[30]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Michael Isard,et al.  Lost in quantization: Improving particular object retrieval in large scale image databases , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[32]  Andrew Zisserman,et al.  Automated Flower Classification over a Large Number of Classes , 2008, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing.

[33]  Geoffrey E. Hinton,et al.  Semantic hashing , 2009, Int. J. Approx. Reason..

[34]  Nikos Komodakis,et al.  Learning to compare image patches via convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Xuelong Li,et al.  Learning Discriminative Binary Codes for Large-scale Cross-modal Retrieval , 2017, IEEE Transactions on Image Processing.

[36]  Yann LeCun,et al.  Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches , 2015, J. Mach. Learn. Res..

[37]  Vincent Lepetit,et al.  Efficient Discriminative Projections for Compact Binary Descriptors , 2012, ECCV.

[38]  Roland Siegwart,et al.  BRISK: Binary Robust invariant scalable keypoints , 2011, 2011 International Conference on Computer Vision.

[39]  Vincent Lepetit,et al.  Learning Image Descriptors with Boosting , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Cordelia Schmid,et al.  Aggregating Local Image Descriptors into Compact Codes , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41]  Trevor Darrell,et al.  The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[42]  Jiwen Lu,et al.  Learning Deep Binary Descriptor with Multi-Quantization , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Gustavo Carneiro,et al.  Learning Local Image Descriptors with Deep Siamese and Triplet Convolutional Networks by Minimizing Global Loss Functions , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Krystian Mikolajczyk,et al.  PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors , 2016, ArXiv.

[45]  Gary R. Bradski,et al.  ORB: An efficient alternative to SIFT or SURF , 2011, 2011 International Conference on Computer Vision.

[46]  Gregory Shakhnarovich,et al.  Learning task-specific similarity , 2005 .

[47]  Wei Liu,et al.  Supervised Discrete Hashing , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[48]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[49]  Jiwen Lu,et al.  Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Thomas Brox,et al.  Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[51]  Rahul Sukthankar,et al.  MatchNet: Unifying feature and metric learning for patch-based matching , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Jiwen Lu,et al.  Deep hashing for compact binary codes learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Larry S. Davis,et al.  Exploiting local features from deep networks for image retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[54]  Hanjiang Lai,et al.  Simultaneous feature learning and hash coding with deep neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[55]  Cordelia Schmid,et al.  Local Convolutional Features with Unsupervised Training for Image Retrieval , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[56]  Chen Huang,et al.  Unsupervised Learning of Discriminative Attributes and Visual Representations , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[58]  Shih-Fu Chang,et al.  Semi-supervised hashing for scalable image retrieval , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[59]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[60]  Matthew A. Brown,et al.  Automatic Panoramic Image Stitching using Invariant Features , 2007, International Journal of Computer Vision.

[61]  Vincent Lepetit,et al.  Boosting Binary Keypoint Descriptors , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[62]  Vincent Lepetit,et al.  BRIEF: Binary Robust Independent Elementary Features , 2010, ECCV.

[63]  Larry S. Davis,et al.  Supervised Incremental Hashing , 2016, BMVC.

[64]  Cordelia Schmid,et al.  A Performance Evaluation of Local Descriptors , 2005, IEEE Trans. Pattern Anal. Mach. Intell..

[65]  Kristen Grauman,et al.  Learning Binary Hash Codes for Large-Scale Image Search , 2013, Machine Learning for Computer Vision.

[66]  Hanjiang Lai,et al.  Supervised Hashing for Image Retrieval via Image Representation Learning , 2014, AAAI.

[67]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[68]  Ngai-Man Cheung,et al.  Learning to Hash with Binary Deep Neural Network , 2016, ECCV.

[69]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[70]  Gang Hua,et al.  Discriminative Learning of Local Image Descriptors , 1990, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[71]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.