Hierarchical Object Discovery and Dense Modelling From Motion Cues in RGB-D Video

In this paper, we propose a novel method for object discovery and dense modelling in RGB-D image sequences using motion cues. We develop our method as a building block for active object perception, so that robots can learn about their environment by perceiving the effects of their actions. Our approach simultaneously segments rigid-body motion within key views and discovers objects and the hierarchical relations between object parts. The poses of the key views are optimized in a graph of spatial relations to recover the rigid-body motion trajectories of the camera with respect to the objects. In experiments, we demonstrate that our approach finds moving objects, aligns partial views of the objects, and retrieves hierarchical relations between the objects.
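
To make the idea of optimizing key-view poses in a graph of spatial relations more concrete, the following is a minimal sketch and not the method of the paper. It assumes a strongly simplified setting in which each key view has only a 3D position (no rotation) and each graph edge stores a measured relative translation between two key views. The function name `relax_translation_graph` and the edge format `(i, j, t_ij)` are hypothetical choices for this illustration. Each free position is repeatedly replaced by the average of the positions predicted by its incident edges, which is a Gauss-Seidel iteration on the unweighted least-squares problem.

```python
def relax_translation_graph(num_views, edges, iterations=200):
    """Estimate key-view positions from relative translation measurements.

    Assumptions (illustrative only): poses are pure 3D translations, every
    edge (i, j, t_ij) measures t_ij ~ p_j - p_i with equal weight, and
    key view 0 is fixed at the origin to remove the global offset ambiguity.
    """
    positions = [[0.0, 0.0, 0.0] for _ in range(num_views)]
    for _ in range(iterations):
        for k in range(1, num_views):  # key view 0 stays fixed
            predictions = []
            for i, j, t in edges:
                if j == k:
                    # Edge i -> k predicts p_k = p_i + t_ik.
                    predictions.append([positions[i][d] + t[d] for d in range(3)])
                elif i == k:
                    # Edge k -> j predicts p_k = p_j - t_kj.
                    predictions.append([positions[j][d] - t[d] for d in range(3)])
            if predictions:
                # Averaging the predictions minimizes the squared edge
                # residuals with respect to p_k while the others are held.
                positions[k] = [
                    sum(p[d] for p in predictions) / len(predictions)
                    for d in range(3)
                ]
    return positions


# Three key views along the x-axis with one loop-closing edge (0 -> 2).
example_edges = [
    (0, 1, [1.0, 0.0, 0.0]),
    (1, 2, [1.0, 0.0, 0.0]),
    (0, 2, [2.0, 0.0, 0.0]),
]
example_positions = relax_translation_graph(3, example_edges)
```

A full treatment would optimize 6-DoF rigid-body poses with measurement uncertainties, typically with a nonlinear least-squares solver; the translation-only version above only shows how redundant spatial relations between key views constrain a consistent trajectory.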
