GraphBit: Bitwise Interaction Mining via Deep Reinforcement Learning

In this paper, we propose a GraphBit method to learn deep binary descriptors in a directed acyclic graph unsupervisedly, representing bitwise interactions as edges between the nodes of bits. Conventional binary representation learning methods enforce each element to be binarized into zero or one. However, there are elements lying in the boundary which suffer from doubtful binarization as "ambiguous bits". Ambiguous bits fail to collect effective information for confident binarization, which are unreliable and sensitive to noise. We argue that there are implicit inner relationships between bits in binary descriptors, where the related bits can provide extra instruction as prior knowledge for ambiguity elimination. Specifically, we design a deep reinforcement learning model to learn the structure of the graph for bitwise interaction mining, reducing the uncertainty of binary codes by maximizing the mutual information with inputs and related bits, so that the ambiguous bits receive additional instruction from the graph for confident binarization. Due to the reliability of the proposed binary codes with bitwise interaction, we obtain an average improvement of 9.64%, 8.84% and 3.22% on the CIFAR-10, Brown and HPatches datasets respectively compared with the state-of-the-art unsupervised binary descriptors.

[1]  Pascal Fua,et al.  LDAHash: Improved Matching with Smaller Descriptors , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Pierre Vandergheynst,et al.  FREAK: Fast Retina Keypoint , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Geoffrey E. Hinton,et al.  Semantic hashing , 2009, Int. J. Approx. Reason..

[4]  Gang Hua,et al.  Collaborative Deep Reinforcement Learning for Joint Object Search , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Pascal Fua,et al.  Receptive Fields Selection for Binary Feature Description , 2014, IEEE Transactions on Image Processing.

[6]  Vincent Lepetit,et al.  Efficient Discriminative Projections for Compact Binary Descriptors , 2012, ECCV.

[7]  Gang Hua,et al.  Discriminative Learning of Local Image Descriptors , 1990, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[9]  Jiwen Lu,et al.  Learning Deep Binary Descriptor with Multi-Quantization , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[11]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[12]  Richard S. Sutton,et al.  Introduction to Reinforcement Learning , 1998 .

[13]  Vincent Lepetit,et al.  Boosting Binary Keypoint Descriptors , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Demis Hassabis,et al.  Mastering the game of Go with deep neural networks and tree search , 2016, Nature.

[15]  Andrea Vedaldi,et al.  HPatches: A Benchmark and Evaluation of Handcrafted and Learned Local Descriptors , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Alexandr Andoni,et al.  Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[17]  Vincent Lepetit,et al.  BRIEF: Binary Robust Independent Elementary Features , 2010, ECCV.

[18]  Shiguang Shan,et al.  Learning Multifunctional Binary Codes for Both Category and Attribute Oriented Retrieval Tasks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Liang Lin,et al.  Attention-Aware Face Hallucination via Deep Reinforcement Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[21]  Jin Young Choi,et al.  Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Pieter Abbeel,et al.  InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.

[23]  Shih-Fu Chang,et al.  Spherical hashing , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Gary R. Bradski,et al.  ORB: An efficient alternative to SIFT or SURF , 2011, 2011 International Conference on Computer Vision.

[25]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[26]  Patrick Pérez,et al.  SuBiC: A Supervised, Structured Binary Code for Image Search , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[27]  Ling Shao,et al.  Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[28]  Xiaogang Wang,et al.  Deep Learning Face Representation from Predicting 10,000 Classes , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  David Barber,et al.  The IM algorithm: a variational approach to Information Maximization , 2003, NIPS 2003.

[30]  Jian Sun,et al.  K-Means Hashing: An Affinity-Preserving Quantization Method for Learning Binary Compact Codes , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Jiwen Lu,et al.  Learning Compact Binary Face Descriptor for Face Recognition , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  Jiwen Lu,et al.  Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[34]  Vincent Lepetit,et al.  Learning Image Descriptors with Boosting , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[35]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[36]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[37]  Andrew Zisserman,et al.  Three things everyone should know to improve object retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Svetlana Lazebnik,et al.  Active Object Localization with Deep Reinforcement Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[39]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[40]  Trevor Darrell,et al.  The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[41]  Andrew Zisserman,et al.  Deep Face Recognition , 2015, BMVC.

[42]  Shih-Fu Chang,et al.  Semi-supervised hashing for scalable image retrieval , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[43]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[44]  R. J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.

[45]  Michael L. Littman,et al.  Reinforcement learning improves behaviour from evaluative feedback , 2015, Nature.

[46]  Jiwen Lu,et al.  Deep hashing for compact binary codes learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Guy Lever,et al.  Deterministic Policy Gradient Algorithms , 2014, ICML.

[48]  Eric P. Xing,et al.  Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[49]  Jiwen Lu,et al.  Attention-Aware Deep Reinforcement Learning for Video Face Recognition , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[50]  Roland Siegwart,et al.  BRISK: Binary Robust invariant scalable keypoints , 2011, 2011 International Conference on Computer Vision.