Recognizing Objects by Piecing Together the Segmentation Puzzle

We present an algorithm that recognizes objects of a given category using a small number of hand-segmented images as references. Our method first oversegments an input image into superpixels and finds a shortlist of optimal superpixel combinations that best fit one of the template parts under affine transformations. It then develops a contextual interpretation of the parts, gluing image segments together using top-down fiducial points and checking overall shape similarity. In contrast to previous work, the search for candidate superpixel combinations is not exponential in the number of segments, and in fact leads to a very efficient detection scheme. Both the storage and the detection of templates require only space and time proportional to the length of the template boundary, allowing us to store potentially millions of templates and to detect a template anywhere in a large image in roughly 0.01 seconds. We apply our algorithm to the Weizmann horse database and show that it performs comparably to the state of the art while offering a simpler and more efficient alternative to previous work.
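The claim that matching cost scales with template boundary length rather than template area can be illustrated with a small sketch. It is not the authors' implementation: it assumes a binary foreground image, a binary template mask, and translation only (no affine search), and all function names are hypothetical. The idea shown is that, once per-row prefix sums of the image are precomputed, the sum over any region needs only two lookups per horizontal run, and each run is delimited by two boundary pixels.

```python
from itertools import accumulate

def row_prefix_sums(image):
    """prefix[r][c] holds the sum of image[r][0:c]; each row has width + 1 entries."""
    return [[0] + list(accumulate(row)) for row in image]

def mask_to_runs(mask):
    """Encode a binary template as horizontal runs (row, col_start, col_end_exclusive).

    This is done once per template. The number of runs is bounded by the
    number of boundary pixels, so storage is proportional to boundary length.
    """
    runs = []
    for r, row in enumerate(mask):
        start = None
        for c, value in enumerate(row):
            if value and start is None:
                start = c                      # entering the template
            elif not value and start is not None:
                runs.append((r, start, c))     # leaving the template
                start = None
        if start is not None:
            runs.append((r, start, len(row)))  # run touches the right edge
    return runs

def overlap_at(prefix, runs, dy, dx):
    """Count image foreground pixels covered by the template shifted by (dy, dx).

    Two prefix lookups per run, so the cost is independent of template area.
    Assumes the shifted template lies fully inside the image.
    """
    return sum(prefix[r + dy][c1 + dx] - prefix[r + dy][c0 + dx]
               for r, c0, c1 in runs)

# Toy data (illustrative only): a 3x2 foreground blob and a 2x2 square template.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [1, 1],
    [1, 1],
]
prefix = row_prefix_sums(image)
runs = mask_to_runs(template)
best_fit = overlap_at(prefix, runs, 1, 1)   # template placed on the blob
poor_fit = overlap_at(prefix, runs, 0, 0)   # template placed at the corner
```

Under these assumptions, scoring one placement touches each run once, so a large template with a simple outline costs about as much as a small one with the same number of boundary crossings.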
