Treelets Binary Feature Retrieval for Fast Keypoint Recognition

Fast keypoint recognition is essential to many vision tasks. In contrast to classification-based approaches, we formulate keypoint recognition directly as an image patch retrieval problem, which has the advantage of finding the matched keypoint and its pose simultaneously. To extract binary features effectively from the patch surrounding each keypoint, we use the treelets transform, which groups highly correlated data together and reduces noise through local analysis. Treelets is a multiresolution analysis tool that provides an orthogonal basis reflecting the geometry of the noise-free data. To facilitate real-world applications, we propose two novel approaches. The first is convolutional treelets, which capture image patch information both locally and globally while reducing the computational cost. The second is higher-order treelets, which reflect the relationship between the rows and columns within an image patch. An efficient sub-signature-based locality-sensitive hashing scheme is employed for fast approximate nearest neighbor search in patch retrieval. Experimental evaluations on both synthetic data and the real-world Oxford dataset show that the proposed treelets binary feature retrieval methods outperform state-of-the-art feature descriptors and classification-based approaches.
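To make the treelets step concrete, the following is a minimal sketch of the basic (non-convolutional, first-order) treelets transform: at each level the two most correlated "sum" variables are merged by a Jacobi rotation that decorrelates them, and the lower-variance result is retired as a "difference" variable. The function names `treelet_basis` and `binarize`, the number of levels, and the mean-thresholding rule used to obtain bits are assumptions made for illustration; the abstract does not specify how the binary features are derived from the treelet coefficients.

```python
import numpy as np


def treelet_basis(X, levels):
    """Return an orthogonal p x p basis built from `levels` treelet merges.

    X is an (n_samples, p) matrix of flattened patches (hypothetical layout).
    """
    p = X.shape[1]
    levels = min(levels, p - 1)          # at most p - 1 merges are possible
    C = np.cov(X, rowvar=False)          # covariance of the current variables
    B = np.eye(p)                        # accumulated orthogonal basis
    active = list(range(p))              # indices still treated as "sum" variables

    for _ in range(levels):
        # Absolute correlation between the current variables.
        d = np.sqrt(np.clip(np.diag(C), 1e-12, None))
        R = np.abs(C / np.outer(d, d))

        # Pick the most correlated pair among the active sum variables.
        best, pair = -1.0, None
        for i, a in enumerate(active):
            for b in active[i + 1:]:
                if R[a, b] > best:
                    best, pair = R[a, b], (a, b)
        a, b = pair

        # Jacobi rotation angle that zeroes the covariance C[a, b].
        theta = 0.5 * np.arctan2(2.0 * C[a, b], C[a, a] - C[b, b])
        c, s = np.cos(theta), np.sin(theta)
        J = np.eye(p)
        J[a, a], J[a, b], J[b, a], J[b, b] = c, -s, s, c

        C = J.T @ C @ J                  # covariance in the rotated coordinates
        B = B @ J                        # update the basis

        # Keep the higher-variance coordinate as the sum variable.
        if C[a, a] < C[b, b]:
            a, b = b, a
        active.remove(b)                 # b becomes a difference variable

    return B


def binarize(X, B, mean):
    """Assumed binarization: one bit per treelet coefficient, by sign after centering."""
    coeffs = (X - mean) @ B
    return (coeffs > 0).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 8))
    X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=200)   # a strongly correlated pair
    B = treelet_basis(X, levels=7)
    bits = binarize(X, B, X.mean(axis=0))
    print(bits.shape)
```

Because every level applies a plane rotation, the resulting basis stays orthogonal, which is the property the abstract attributes to treelets. The resulting bit vectors could then be compared by Hamming distance, for example inside a hashing index.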
