Generating object hypotheses in natural scenes through human-robot interaction

We propose a method for interactive modeling of objects and object relations based on real-time segmentation of video sequences. In interaction with a human, the robot can perform multi-object segmentation through principled modeling of physical constraints. The key contribution is an efficient multi-labeling framework that allows object modeling and disambiguation in natural scenes. Object modeling and labeling are done in a real-time segmentation system, to which hypotheses and constraints denoting relations between objects can be added incrementally. Through instructions such as key presses or spoken words, a scene can be segmented into regions corresponding to multiple physical objects. The approach solves some of the difficult problems of disambiguating objects that are merged because they are in direct physical contact. Results show that even a limited set of simple interactions with a human operator can substantially improve segmentation results.
