Exploring the Limitations of the Convolutional Neural Networks on Binary Tests Selection for Local Features

Convolutional Neural Networks (CNNs) have been used successfully to recognize and extract visual patterns in tasks such as object detection, object classification, scene recognition, and image retrieval. CNNs have also contributed to local feature extraction by learning local representations. A representative approach is LIFT, which generates keypoint descriptors that are more discriminative than those of handcrafted algorithms such as SIFT, BRIEF, and SURF. In this paper, we investigate the binary tests selection problem and present an in-depth study of the limits of searching for solutions with CNNs when the gradient is computed from the local neighborhood of the selected pixels. We performed several experiments with a Siamese Network trained on corresponding and non-corresponding patch pairs. Our results show the presence of local minima and of a problem that we call Incorrect Gradient Components. Our aim is to understand the binary tests selection problem, along with some limitations of Convolutional Neural Networks, in order to avoid searching for solutions in unviable directions.
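To make the term "binary test" concrete, the following is a minimal sketch of a BRIEF-style binary descriptor, not the method studied in the paper. Each test compares the intensities of two pixels in a patch and emits one bit; the selection problem is the choice of which pixel pairs to compare. Here the pairs are drawn at random, which is an assumption made purely for illustration, and the names `sample_tests`, `describe`, and `hamming` are hypothetical.

```python
import random


def sample_tests(n_tests, patch_size, seed=0):
    """Draw random pixel pairs (y1, x1, y2, x2); a stand-in for a learned selection."""
    rng = random.Random(seed)
    return [
        tuple(rng.randrange(patch_size) for _ in range(4))
        for _ in range(n_tests)
    ]


def describe(patch, tests):
    """One bit per test: 1 if the first pixel is darker than the second."""
    return [1 if patch[y1][x1] < patch[y2][x2] else 0
            for (y1, x1, y2, x2) in tests]


def hamming(a, b):
    """Number of differing bits between two descriptors."""
    return sum(x != y for x, y in zip(a, b))


# Toy data (illustrative only): a random patch, a uniformly brightened
# copy of it, and an unrelated random patch.
rng = random.Random(1)
size = 8
patch = [[rng.randrange(256) for _ in range(size)] for _ in range(size)]
brighter = [[v + 10 for v in row] for row in patch]
other = [[rng.randrange(256) for _ in range(size)] for _ in range(size)]

tests = sample_tests(32, size)
d_patch = describe(patch, tests)
d_same = describe(brighter, tests)
d_other = describe(other, tests)

print("distance to brightened copy:", hamming(d_patch, d_same))
print("distance to unrelated patch:", hamming(d_patch, d_other))
```

Because each bit depends only on the ordering of two intensities, a uniform brightness shift leaves the descriptor unchanged. It also suggests why selecting tests by gradient descent is delicate: the descriptor depends on pixel positions, and a gradient estimated from the neighborhood of the currently selected pixels carries only local information about where better pairs might lie.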
