Fast and bottom-up object detection, segmentation, and evaluation using Gestalt principles

In many scenarios, domestic robots will regularly encounter unknown objects. In such cases, top-down knowledge about the object cannot be used for detection, recognition, or classification. To learn about the object, or to be able to grasp it, the robot needs bottom-up object segmentation. Even when top-down knowledge is available, prior segmentation of the object can improve recognition and classification. In this paper, we focus on the problem of bottom-up detection and segmentation of unknown objects. Gestalt psychology studies the same phenomenon in human vision, and we propose to use a number of Gestalt principles. Our method starts by using symmetry to generate a set of hypotheses about the location of objects. These hypotheses are then used to initialize the segmentation process. The main focus of the paper is on evaluating the resulting object segments using Gestalt principles, in order to select segments with high figural goodness. The results show that Gestalt principles can be used successfully for the detection and segmentation of unknown objects. The results furthermore indicate that the Gestalt measures for the goodness of a segment correspond well with the objective quality of the segment. We exploit this to improve the overall segmentation performance.
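
As a rough illustration of the evaluate-and-select step described above, the following sketch scores candidate segments and keeps the one with the highest score. The two cues used here (bounding-box compactness and inside/outside intensity contrast), their product as a combined score, and all function names are assumptions made for illustration only; they are not the Gestalt measures defined in the paper.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]   # (row, col)
Image = List[List[float]]  # grayscale intensities


def compactness(segment: Set[Pixel]) -> float:
    """Hypothetical cue: fraction of the bounding box filled by the segment."""
    if not segment:
        return 0.0
    rows = [r for r, _ in segment]
    cols = [c for _, c in segment]
    box_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return len(segment) / box_area


def contrast(image: Image, segment: Set[Pixel]) -> float:
    """Hypothetical cue: |mean intensity inside - mean intensity outside|."""
    inside, outside = [], []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            (inside if (r, c) in segment else outside).append(value)
    if not inside or not outside:
        return 0.0
    return abs(sum(inside) / len(inside) - sum(outside) / len(outside))


def goodness(image: Image, segment: Set[Pixel]) -> float:
    """Assumed combination rule: product of the two illustrative cues."""
    return compactness(segment) * contrast(image, segment)


def select_best(image: Image, segments: List[Set[Pixel]]) -> Set[Pixel]:
    """Return the candidate segment with the highest goodness score."""
    return max(segments, key=lambda s: goodness(image, s))


# Toy scene: a bright 2x2 square on a dark 4x4 background.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
square = {(1, 1), (1, 2), (2, 1), (2, 2)}  # matches the object
diagonal = {(0, 0), (1, 1), (2, 2)}        # poor hypothesis
best = select_best(image, [diagonal, square])
```

In this toy scene the square candidate fills its bounding box and separates bright from dark pixels perfectly, so it outranks the diagonal candidate. A real system would replace both cues with measures grounded in the Gestalt principles the paper evaluates.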
