Scene Understanding through Autonomous Interactive Perception

We propose a framework for detecting, extracting, and modeling objects in natural scenes from multi-modal data. The framework is iterative and exploits different hypotheses in a complementary manner: each generated hypothesis feeds into the subsequent one, continuously refining the predictions about the scene. We apply the framework in realistic scenarios, using visual appearance and depth information. Object hypotheses generated from appearance information are confirmed by a robotic manipulator that interacts with the scene through pushing. We show results that demonstrate the synergistic effect of applying multiple hypotheses to real-world scene understanding. The method is efficient and runs in real time.
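The iterative loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Hypothesis` class, the `refine` function, the confidence values, and the rule that a failed push splits a hypothesis into two are all assumptions made for illustration, and the `push` callback stands in for the robot action and the motion analysis that would follow it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    """Hypothetical object hypothesis produced from appearance and depth cues."""
    label: str
    confidence: float      # belief that the segment is a single object
    confirmed: bool = False


def refine(hypotheses: List[Hypothesis],
           push: Callable[[Hypothesis], bool],
           max_rounds: int = 10) -> List[Hypothesis]:
    """Repeatedly push the least certain hypothesis and update the scene model.

    `push` is an assumed callback that returns True when the pushed segment
    moved as one rigid body, and False otherwise.
    """
    current = list(hypotheses)
    for _ in range(max_rounds):
        pending = [h for h in current if not h.confirmed]
        if not pending:
            break
        target = min(pending, key=lambda h: h.confidence)
        if push(target):
            # The segment moved as a whole: accept it as one object.
            target.confidence = 1.0
            target.confirmed = True
        else:
            # Assumed refinement rule: replace the hypothesis with two
            # new ones, which are tested in later rounds.
            current.remove(target)
            current.append(Hypothesis(target.label + "/a", 0.5))
            current.append(Hypothesis(target.label + "/b", 0.5))
    return current


if __name__ == "__main__":
    # Stub push result: only the merged segment fails to move rigidly.
    result = refine(
        [Hypothesis("cup+box", 0.4), Hypothesis("book", 0.8)],
        push=lambda h: h.label != "cup+box",
    )
    for h in result:
        print(h.label, h.confirmed)
```

In this sketch the outcome of each push changes the set of hypotheses that later rounds test, which mirrors the idea that each hypothesis feeds into the subsequent one.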
