Combining Regions and Patches for Object Class Localization

We introduce a method for object class detection and localization which combines regions generated by image segmentation with local patches. Region-based descriptors can model and match regular textures reliably, but fail on parts of the object which are textureless. They also cannot repeatably identify interest points on their boundaries. By incorporating information from patch-based descriptors near the regions into a new feature, the Region-based Context Feature (RCF), we can address these issues. We apply Region-based Context Features in a semi-supervised learning framework for object detection and localization. This framework produces object-background segmentation masks of deformable objects. Numerical results are presented for pixel-level performance.

[1]  Tony Lindeberg,et al.  Shape-Adapted Smoothing in Estimation of 3-D Depth Cues from Affine Distortions of Local 2-D Brightness Structure , 1994, ECCV.

[2]  Jitendra Malik,et al.  Textons, contours and regions: cue integration in image segmentation , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[3]  Takeo Kanade,et al.  A statistical method for 3D object detection applied to faces and cars , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[4]  Pietro Perona,et al.  Viewpoint-invariant learning and detection of human heads , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[5]  Jitendra Malik,et al.  A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[6]  Cordelia Schmid,et al.  Constructing models for content-based image retrieval , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[7]  Dorin Comaniciu,et al.  Mean Shift: A Robust Approach Toward Feature Space Analysis , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[8]  Jitendra Malik,et al.  Estimating Human Body Configurations Using Shape Context Matching , 2002, ECCV.

[9]  Jitendra Malik,et al.  Blobworld: Image Segmentation Using Expectation-Maximization and Its Application to Image Querying , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[10]  Martial Hebert,et al.  Discriminative Fields for Modeling Spatial Dependencies in Natural Images , 2003, NIPS.

[11]  B. Schiele,et al.  Interleaved Object Categorization and Segmentation , 2003, BMVC.

[12]  Antonio Torralba,et al.  Using the Forest to See the Trees: A Graphical Model Relating Features, Objects, and Scenes , 2003, NIPS.

[13]  Cordelia Schmid,et al.  Selection of scale-invariant parts for object class recognition , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[14]  Jitendra Malik,et al.  Learning a classification model for segmentation , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[15]  Zhuowen Tu,et al.  Image Parsing: Unifying Segmentation, Detection, and Recognition , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[16]  Paul A. Viola,et al.  Detecting Pedestrians Using Patterns of Motion and Appearance , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[17]  Jianbo Shi,et al.  Object-Specific Figure-Ground Segregation , 2003, CVPR.

[18]  Bernt Schiele,et al.  Scale-Invariant Object Categorization Using a Scale-Adaptive Mean-Shift Search , 2004, DAGM-Symposium.

[19]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[20]  Andrew Zisserman,et al.  An Affine Invariant Salient Region Detector , 2004, ECCV.

[21]  Cordelia Schmid,et al.  Semi-Local Affine Parts for Object Recognition , 2004, BMVC.

[22]  Cordelia Schmid,et al.  Scale & Affine Invariant Interest Point Detectors , 2004, International Journal of Computer Vision.

[23]  Dan Roth,et al.  Learning to detect objects in images via a sparse, part-based representation , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Andrew Zisserman,et al.  OBJ CUT , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[25]  Frédéric Jurie,et al.  Creating efficient codebooks for visual recognition , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[26]  Axel Pinz,et al.  Object Localization with Boosting and Weak Supervision for Generic Object Recognition , 2005, SCIA.

[27]  Bernt Schiele,et al.  Interleaving Object Categorization and Segmentation , 2006, Cognitive Vision Systems.