Keypoint detection in RGBD images based on an efficient viewpoint-covariant multiscale representation

Texture+depth (RGBD) images provide information about the geometry of a scene, which could help improve image matching performance, e.g., in the presence of large viewpoint changes. While depth has mainly been used to compute keypoint descriptors, in this paper we focus on the keypoint detection problem. To produce a computationally efficient, viewpoint-covariant multiscale representation, we design a smoothing procedure that locally smooths a texture image according to the corresponding depth. This yields an approximate scale space in which keypoints can be found with a multiscale detector. Our experiments on both synthetic and real-world images show substantial gains with respect to 2D and other RGBD feature extraction approaches.
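
To make the idea of smoothing texture according to depth more concrete, the following is a minimal sketch and not the operator proposed in the paper. It assumes a pinhole camera, so that a fixed radius in the scene projects to a pixel radius of `world_radius * focal_px / depth`, and it swaps in a plain box average computed from a summed-area table as the smoothing filter. The function name and the parameters `world_radius` and `focal_px` are hypothetical and chosen only for illustration.

```python
import numpy as np


def depth_adaptive_box_smooth(texture, depth, world_radius, focal_px):
    """Box-average each pixel over a window whose size depends on depth.

    Illustrative assumption (not taken from the paper): the pixel radius at
    (y, x) is round(world_radius * focal_px / depth[y, x]), i.e. a fixed
    scene-space radius projected through a pinhole camera.
    `texture` and `depth` are 2D arrays of equal shape; depth must be > 0.
    """
    h, w = texture.shape

    # Summed-area table with a zero row and column prepended, so that the
    # sum over rows y0..y1-1 and columns x0..x1-1 needs four lookups.
    sat = np.zeros((h + 1, w + 1), dtype=np.float64)
    sat[1:, 1:] = texture.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

    # Per-pixel window radius: near surfaces get large windows, far ones small.
    radius = np.rint(world_radius * focal_px / depth).astype(int)
    radius = np.maximum(radius, 0)

    # Window bounds, clipped to the image borders.
    ys, xs = np.mgrid[0:h, 0:w]
    y0 = np.clip(ys - radius, 0, h)
    y1 = np.clip(ys + radius + 1, 0, h)
    x0 = np.clip(xs - radius, 0, w)
    x1 = np.clip(xs + radius + 1, 0, w)

    window_sum = sat[y1, x1] - sat[y0, x1] - sat[y1, x0] + sat[y0, x0]
    area = (y1 - y0) * (x1 - x0)
    return window_sum / area
```

Under these assumptions the cost per pixel does not depend on the window size, which is the usual reason for using a summed-area table. Calling the function with a sequence of increasing `world_radius` values would give a stack of progressively smoother images on which a multiscale detector could search for extrema; how the paper builds and samples its scale space is not specified in the abstract.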
