Dense Segmentation-Aware Descriptors

In this work we exploit segmentation to construct appearance descriptors that can robustly deal with occlusion and background changes. For this, we downplay measurements coming from areas that are unlikely to belong to the same region as the descriptor's center, as suggested by soft segmentation masks. Our treatment is applicable to any image point, i.e. dense, and its computational overhead is in the order of a few seconds. We integrate this idea with Dense SIFT, and also with Dense Scale and Rotation Invariant Descriptors (SID), delivering descriptors that are densely computable, invariant to scaling and rotation, and robust to background changes. We apply our approach to standard benchmarks on large displacement motion estimation using SIFT-flow and wide-baseline stereo, systematically demonstrating that the introduction of segmentation yields clear improvements.

[1]  Andrew Zisserman,et al.  Learning Local Feature Descriptors Using Convex Optimisation , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Iasonas Kokkinos,et al.  Segmentation-Aware Deformable Part Models , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Kun Liu,et al.  Rotation-Invariant HOG Descriptors Using Fourier Analysis in Polar and Spherical Coordinates , 2014, International Journal of Computer Vision.

[4]  C. Lawrence Zitnick,et al.  Structured Forests for Fast Edge Detection , 2013, 2013 IEEE International Conference on Computer Vision.

[5]  Cristian Sminchisescu,et al.  Efficient Closed-Form Solution to Generalized Boundary Detection , 2012, ECCV.

[6]  Andrew Zisserman,et al.  Descriptor Learning Using Convex Optimisation , 2012, ECCV.

[7]  Lihi Zelnik-Manor,et al.  On SIFTs and their scales , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Iasonas Kokkinos,et al.  Intrinsic shape context descriptors for deformable shapes , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Stefan Roth,et al.  Learning rotation-aware features: From invariant priors to equivariant descriptors , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  A. Yuille,et al.  Dense Scale Invariant Descriptors for Images and Surfaces , 2012 .

[11]  Pietro Perona,et al.  Object detection and segmentation from joint embedding of parts and pixels , 2011, 2011 International Conference on Computer Vision.

[12]  Gary R. Bradski,et al.  ORB: An efficient alternative to SIFT or SURF , 2011, 2011 International Conference on Computer Vision.

[13]  Francesc Moreno-Noguer,et al.  Deformation and illumination invariant feature point descriptor , 2011, CVPR 2011.

[14]  Antonio Torralba,et al.  SIFT Flow: Dense Correspondence across Scenes and Its Applications , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Andrea Vedaldi,et al.  Vlfeat: an open and portable library of computer vision algorithms , 2010, ACM Multimedia.

[16]  Jitendra Malik,et al.  Object Segmentation by Long Term Analysis of Point Trajectories , 2010, ECCV.

[17]  Vincent Lepetit,et al.  BRIEF: Binary Robust Independent Elementary Features , 2010, ECCV.

[18]  Vincent Lepetit,et al.  DAISY: An Efficient Dense Descriptor Applied to Wide-Baseline Stereo , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Vincent Lepetit,et al.  Fast Keypoint Recognition Using Random Ferns , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Mark Everingham,et al.  Implicit color segmentation features for pedestrian and object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[21]  Matthew A. Brown,et al.  Picking the best DAISY , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Stefano Soatto,et al.  Localizing Objects with Smart Dictionaries , 2008, ECCV.

[23]  Antonio Torralba,et al.  SIFT Flow: Dense Correspondence across Different Scenes , 2008, ECCV.

[24]  Iasonas Kokkinos,et al.  Scale invariance without scale selection , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Jitendra Malik,et al.  Using contours to detect and localize junctions in natural images , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Pascal Fua,et al.  On benchmarking camera calibration and multi-view stereo for high resolution imagery , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Luc Van Gool,et al.  Speeded-Up Robust Features (SURF) , 2008, Comput. Vis. Image Underst..

[28]  René Vidal,et al.  A Benchmark for the Comparison of 3-D Motion Segmentation Algorithms , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Eli Shechtman,et al.  Matching Local Self-Similarities across Images and Videos , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Vladimir Kolmogorov,et al.  Convergent Tree-Reweighted Message Passing for Energy Minimization , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Jian Yao,et al.  3D modeling and rendering from multiple wide-baseline images by match propagation , 2006, Signal Process. Image Commun..

[32]  Frédéric Jurie,et al.  Sampling Strategies for Bag-of-Features Image Classification , 2006, ECCV.

[33]  Cordelia Schmid,et al.  A Comparison of Affine Region Detectors , 2005, International Journal of Computer Vision.

[34]  Haibin Ling,et al.  Deformation invariant image matching , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[35]  Martial Hebert,et al.  Incorporating Background Invariance into Feature-Based Object Recognition , 2005, 2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV/MOTION'05) - Volume 1.

[36]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[37]  R. Sukthankar,et al.  PCA-SIFT: a more distinctive representation for local image descriptors , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[38]  Jitendra Malik,et al.  Learning a classification model for segmentation , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[39]  Luc Van Gool,et al.  Dense matching of multiple wide-baseline views , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[40]  Cordelia Schmid,et al.  A performance evaluation of local descriptors , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41]  Mikhail Belkin,et al.  Laplacian Eigenmaps for Dimensionality Reduction and Data Representation , 2003, Neural Computation.

[42]  Joost van de Weijer,et al.  Fast Anisotropic Gauss Filtering , 2002, ECCV.

[43]  Jitendra Malik,et al.  Shape matching and object recognition using shape contexts , 2010, 2010 3rd International Conference on Computer Science and Information Technology.

[44]  Jitendra Malik,et al.  Geometric blur for template matching , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[45]  Olga Veksler,et al.  Fast approximate energy minimization via graph cuts , 2001, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[46]  George Wolberg,et al.  Robust image registration using log-polar transform , 2000, Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101).

[47]  Roberto Manduchi,et al.  Bilateral filtering for gray and color images , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[48]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[49]  Cordelia Schmid,et al.  Local Grayvalue Invariants for Image Retrieval , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[50]  L. Rudin,et al.  Nonlinear total variation based noise removal algorithms , 1992 .

[51]  Edward H. Adelson,et al.  The Design and Use of Steerable Filters , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[52]  Stéphane Mallat,et al.  Zero-crossings of a wavelet transform , 1991, IEEE Trans. Inf. Theory.

[53]  Jitendra Malik,et al.  Scale-Space and Edge Detection Using Anisotropic Diffusion , 1990, IEEE Trans. Pattern Anal. Mach. Intell..

[54]  Yehoshua Y. Zeevi,et al.  The Generalized Gabor Scheme of Image Representation in Biological and Machine Vision , 1988, IEEE Trans. Pattern Anal. Mach. Intell..

[55]  Rachid Deriche,et al.  Using Canny's criteria to derive a recursively implemented optimal edge detector , 1987, International Journal of Computer Vision.

[56]  Gunilla Borgefors,et al.  Distance transformations in digital images , 1986, Comput. Vis. Graph. Image Process..

[57]  D. Casasent,et al.  Position, rotation, and scale invariant optical correlation. , 1976, Applied optics.

[58]  L. R. Dice Measures of the Amount of Ecologic Association Between Species , 1945 .

[59]  Pascal Fua,et al.  LDAHash: Improved Matching with Smaller Descriptors , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[60]  W. Marsden I and J , 2012 .

[61]  Gang Hua,et al.  Discriminative Learning of Local Image Descriptors , 1990, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[62]  Christopher Hunt SURF: Speeded-Up Robust Features , 2009 .

[63]  Wilson S. Geisler,et al.  Multichannel Texture Analysis Using Localized Spatial Filters , 1990, IEEE Trans. Pattern Anal. Mach. Intell..