Object detection using 2D spatial ordering constraints

Object detection is challenging partly due to the limited discriminative power of local feature descriptors. We amend this limitation by incorporating spatial constraints among neighboring features. We propose a two-step algorithm. First, a feature together with its spatial neighbors forms a flexible feature template. Two feature templates can be compared more informatively than two individual features without knowing the 3D object model. A large portion of false matches can be excluded after the first step. In a second global matching step, object detection is formulated as a graph-matching problem. A model graph is constructed by applying Delaunay triangulation on the surviving features. The best matching graph in an input image is computed by finding the maximum a posterior (MAP) estimate of a binary Markov random field with triangular maximal clique. The optimization is solved by the max-product algorithm (a.k.a. belief propagation). Experiments on both rigid and non-rigid objects demonstrate the generality and efficacy of the proposed methods.

[1]  Takeo Kanade,et al.  Stereo by Intra- and Inter-Scanline Search Using Dynamic Programming , 1985, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Pietro Perona,et al.  Recognition by Probabilistic Hypothesis Construction , 2004, ECCV.

[3]  Shinji Umeyama,et al.  An Eigendecomposition Approach to Weighted Graph Matching Problems , 1988, IEEE Trans. Pattern Anal. Mach. Intell..

[4]  Dan Roth,et al.  Learning a Sparse Representation for Object Detection , 2002, ECCV.

[5]  Edwin R. Hancock,et al.  Graph Matching With a Dual-Step EM Algorithm , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[6]  Yali Amit,et al.  Joint Induction of Shape Features and Tree Classifiers , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[7]  Cordelia Schmid,et al.  A performance evaluation of local descriptors , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Vladimir Kolmogorov,et al.  What energy functions can be minimized via graph cuts? , 2002, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Cordelia Schmid,et al.  3D object modeling and recognition using affine-invariant patches and multi-view spatial constraints , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[10]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[11]  Michael J. Black,et al.  On the unification of line processes, outlier rejection, and robust statistics with applications in early vision , 1996, International Journal of Computer Vision.

[12]  William T. Freeman,et al.  Understanding belief propagation and its generalizations , 2003 .

[13]  Michael I. Jordan Graphical Models , 2003 .

[14]  Nanning Zheng,et al.  Stereo Matching Using Belief Propagation , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  Luc Van Gool,et al.  Integrating multiple model views for object recognition , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[16]  Andrew Zisserman,et al.  Multi-view Matching for Unordered Image Sets, or "How Do I Organize My Holiday Snaps?" , 2002, ECCV.

[17]  Edward H. Adelson,et al.  The Design and Use of Steerable Filters , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[18]  Nanning Zheng,et al.  Stereo Matching Using Belief Propagation , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[19]  Luc Van Gool,et al.  Affine/ Photometric Invariants for Planar Intensity Patterns , 1996, ECCV.

[20]  Michael Brady,et al.  Saliency, Scale and Image Description , 2001, International Journal of Computer Vision.

[21]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[22]  Brendan J. Frey,et al.  Factor graphs and the sum-product algorithm , 2001, IEEE Trans. Inf. Theory.

[23]  Luc Van Gool,et al.  Wide-baseline multiple-view correspondences , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[24]  Michael J. Black,et al.  On the unification of line processes , 1996 .