Evaluation of Features Detectors and Descriptors Based on 3D Objects

We explore the performance of a number of popular feature detectors and descriptors in matching 3D object features across viewpoints and lighting conditions. To this end we design a method, based on intersecting epipolar constraints, for providing ground truth correspondence automatically. We collect a database of 100 objects viewed from 144 calibrated viewpoints under three different lighting conditions. We find that the combination of Hessian-affine feature finder and SIFT features is most robust to viewpoint change. Harris-affine combined with SIFT and Hessian-affine combined with shape context descriptors were best respectively for lighting changes and scale changes. We also find that no detector-descriptor combination performs well with viewpoint changes of more than 25-30/spl deg/.

[1]  J. G. Semple,et al.  Algebraic Projective Geometry , 1953 .

[2]  Paul Beaudet,et al.  Rotationally invariant image operators , 1978 .

[3]  Takeo Kanade,et al.  An Iterative Image Registration Technique with an Application to Stereo Vision , 1981, IJCAI.

[4]  James L. Crowley,et al.  A Representation for Shape Based on Peaks and Ridges in the Difference of Low-Pass Transform , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Christopher G. Harris,et al.  A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[6]  Edward H. Adelson,et al.  The Design and Use of Steerable Filters , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[7]  Luc Vincent,et al.  Watersheds in Digital Spaces: An Efficient Algorithm Based on Immersion Simulations , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[8]  Carlo Tomasi,et al.  Good features to track , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[9]  T. Lindeberg Scale-Space Theory : A Basic Tool for Analysing Structures at Different Scales , 1994 .

[10]  Cordelia Schmid,et al.  Local Grayvalue Invariants for Image Retrieval , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[11]  Tony Lindeberg,et al.  Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure , 1997, Image Vis. Comput..

[12]  Lucas J. van Vliet,et al.  Recursive Gaussian derivative filters , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[13]  P. Perona,et al.  Visual methods for three-dimensional modeling , 1999 .

[14]  Luc Van Gool,et al.  Wide Baseline Stereo Matching based on Local, Affinely Invariant Regions , 2000, BMVC.

[15]  Andrew Zisserman,et al.  Viewpoint invariant texture matching and wide baseline stereo , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[16]  Cordelia Schmid,et al.  An Affine Invariant Interest Point Detector , 2002, ECCV.

[17]  James J. Little,et al.  Mobile Robot Localization and Mapping with Uncertainty using Scale-Invariant Visual Landmarks , 2002, Int. J. Robotics Res..

[18]  Matthew A. Brown,et al.  Recognising panoramas , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[19]  Pietro Perona,et al.  Common-Frame Model for Object Recognition , 2004, NIPS.

[20]  Cordelia Schmid,et al.  Evaluation of Interest Point Detectors , 2000, International Journal of Computer Vision.

[21]  T. Tuytelaars,et al.  Matching Widely Separated Views Based on Affine Invariant Regions , 2004, International Journal of Computer Vision.

[22]  Andrew Zisserman,et al.  An Affine Invariant Salient Region Detector , 2004, ECCV.

[23]  Manuel González,et al.  Affine Invariant Texture Segmentation and Shape from Texture by Variational Methods , 1998, Journal of Mathematical Imaging and Vision.

[24]  Yan Ke,et al.  PCA-SIFT: a more distinctive representation for local image descriptors , 2004, CVPR 2004.

[25]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[26]  Jiri Matas,et al.  Robust wide-baseline stereo from maximally stable extremal regions , 2004, Image Vis. Comput..

[27]  Tony Lindeberg,et al.  Feature Detection with Automatic Scale Selection , 1998, International Journal of Computer Vision.

[28]  A. Jepson,et al.  Flexible spatial models for grouping local image features , 2004, CVPR 2004.

[29]  Bernt Schiele,et al.  Pedestrian detection in crowded scenes , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[30]  Cordelia Schmid,et al.  A Comparison of Affine Region Detectors , 2005, International Journal of Computer Vision.

[31]  Bernt Schiele,et al.  Local features for object class recognition , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[32]  Cordelia Schmid,et al.  A Performance Evaluation of Local Descriptors , 2005, IEEE Trans. Pattern Anal. Mach. Intell..