Learning Deep Binary Descriptor with Multi-Quantization

In this paper, we propose an unsupervised feature learning method called deep binary descriptor with multi-quantization (DBD-MQ) for visual analysis. Existing learning-based binary descriptors such as compact binary face descriptor (CBFD) and DeepBit utilize the rigid sign function for binarization despite of data distributions, which usually suffer from severe quantization loss. In order to address the limitation, we propose a deep multi-quantization network to learn a data-dependent binarization in an unsupervised manner. More specifically, we design a K-Autoencoders (KAEs) network to jointly learn the parameters of feature extractor and the binarization functions under a deep learning framework, so that discriminative binary descriptors can be obtained with a fine-grained multi-quantization. As DBD-MQ simply allocates the same number of quantizers to each real-valued feature dimension ignoring the elementwise diversity of informativeness, we further propose a deep competitive binary descriptor with multi-quantization (DCBD-MQ) method to learn optimal allocation of bits with the fixed binary length in a competitive manner, where informative dimensions gain more bits for complete representation. Moreover, we present a similarity-aware binary encoding strategy based on the earth mover's distance of Autoencoders, so that elements that are quantized into similar Autoencoders will have smaller Hamming distances. Extensive experimental results on six widely-used datasets show that our DBD-MQ and DCBD-MQ outperform most state-of-the-art unsupervised binary descriptors.

[1]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Alexandr Andoni,et al.  Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[3]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Michael Isard,et al.  Lost in quantization: Improving particular object retrieval in large scale image databases , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Leonidas J. Guibas,et al.  The Earth Mover's Distance as a Metric for Image Retrieval , 2000, International Journal of Computer Vision.

[6]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[7]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[8]  Ngai-Man Cheung,et al.  Learning to Hash with Binary Deep Neural Network , 2016, ECCV.

[9]  Jiwen Lu,et al.  Simultaneous Local Binary Feature Learning and Encoding for Homogeneous and Heterogeneous Face Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[11]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Jiwen Lu,et al.  Learning Compact Binary Face Descriptor for Face Recognition , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Ling Shao,et al.  Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[14]  Andrea Vedaldi,et al.  HPatches: A Benchmark and Evaluation of Handcrafted and Learned Local Descriptors , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Anil K. Jain Data clustering: 50 years beyond K-means , 2008, Pattern Recognit. Lett..

[16]  Jiwen Lu,et al.  Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[18]  Shih-Fu Chang,et al.  Semi-supervised hashing for scalable image retrieval , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[19]  Hanjiang Lai,et al.  Supervised Hashing for Image Retrieval via Image Representation Learning , 2014, AAAI.

[20]  Thomas Brox,et al.  Descriptor Matching with Convolutional Neural Networks: a Comparison to SIFT , 2014, ArXiv.

[21]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[22]  Patrick Pérez,et al.  SuBiC: A Supervised, Structured Binary Code for Image Search , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[23]  Andrew Zisserman,et al.  Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.

[24]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[25]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[26]  Ivan Laptev,et al.  Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Gang Hua,et al.  Discriminative Learning of Local Image Descriptors , 1990, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Hanjiang Lai,et al.  Simultaneous feature learning and hash coding with deep neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Jiwen Lu,et al.  Learning Rotation-Invariant Local Binary Descriptor , 2017, IEEE Transactions on Image Processing.

[30]  Cordelia Schmid,et al.  Local Convolutional Features with Unsupervised Training for Image Retrieval , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[31]  Andrew Zisserman,et al.  Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[32]  Matti Pietikäinen,et al.  Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[33]  Nicu Sebe,et al.  A Survey on Learning to Hash , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[35]  Vincent Lepetit,et al.  Learning Image Descriptors with Boosting , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Pascal Fua,et al.  Receptive Fields Selection for Binary Feature Description , 2014, IEEE Transactions on Image Processing.

[37]  Stefan Carlsson,et al.  CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[38]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[39]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Ming Yang,et al.  3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41]  Xueming Qian,et al.  PLBP: An effective local binary patterns texture descriptor with pyramid representation , 2011, Pattern Recognit..

[42]  Krystian Mikolajczyk,et al.  Binary Online Learned Descriptors , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Larry S. Davis,et al.  Exploiting local features from deep networks for image retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[44]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Shih-Fu Chang,et al.  Spherical hashing , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[46]  Gary R. Bradski,et al.  ORB: An efficient alternative to SIFT or SURF , 2011, 2011 International Conference on Computer Vision.

[47]  Jiwen Lu,et al.  Learning Deep Binary Descriptor with Multi-Quantization , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[48]  Trevor Darrell,et al.  The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[49]  Andrew Zisserman,et al.  Deep Face Recognition , 2015, BMVC.

[50]  Geoffrey E. Hinton,et al.  Semantic hashing , 2009, Int. J. Approx. Reason..

[51]  Vincent Lepetit,et al.  Efficient Discriminative Projections for Compact Binary Descriptors , 2012, ECCV.

[52]  Andrew Zisserman,et al.  Three things everyone should know to improve object retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[53]  Shiguang Shan,et al.  Deep Supervised Hashing for Fast Image Retrieval , 2016, International Journal of Computer Vision.

[54]  Pascal Fua,et al.  LDAHash: Improved Matching with Smaller Descriptors , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[55]  Pierre Vandergheynst,et al.  FREAK: Fast Retina Keypoint , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[56]  Xiaogang Wang,et al.  Deep Learning Face Representation from Predicting 10,000 Classes , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[57]  Jian Sun,et al.  K-Means Hashing: An Affinity-Preserving Quantization Method for Learning Binary Compact Codes , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[58]  Yan Lu,et al.  Local Descriptors Optimized for Average Precision , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[59]  Vincent Lepetit,et al.  Boosting Binary Keypoint Descriptors , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[60]  Vincent Lepetit,et al.  BRIEF: Binary Robust Independent Elementary Features , 2010, ECCV.

[61]  Rong Xiao,et al.  Pairwise Rotation Invariant Co-Occurrence Local Binary Pattern , 2014, IEEE Trans. Pattern Anal. Mach. Intell..

[62]  Shiguang Shan,et al.  Learning Multifunctional Binary Codes for Both Category and Attribute Oriented Retrieval Tasks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Victor S. Lempitsky,et al.  Neural Codes for Image Retrieval , 2014, ECCV.

[64]  Bin Fan,et al.  L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[65]  Roland Siegwart,et al.  BRISK: Binary Robust invariant scalable keypoints , 2011, 2011 International Conference on Computer Vision.

[66]  Cordelia Schmid,et al.  Aggregating Local Image Descriptors into Compact Codes , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[67]  Gregory Shakhnarovich,et al.  Learning task-specific similarity , 2005 .

[68]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[69]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[70]  Shiliang Zhang,et al.  USB: Ultrashort Binary Descriptor for Fast Visual Matching and Retrieval , 2014, IEEE Transactions on Image Processing.

[71]  Jiwen Lu,et al.  Context-Aware Local Binary Feature Learning for Face Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[72]  Forrest N. Iandola,et al.  SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.

[73]  Matti Pietikäinen,et al.  Face Description with Local Binary Patterns: Application to Face Recognition , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[74]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).