Using Deep Convolutional Networks for Occlusion Edge Detection in RGB-D Frames

Occlusion edges in images correspond to range discontinuities in the scene from the observer's point of view. Detecting them is an important prerequisite for many vision and mobile robot tasks. Although occlusion edges can be extracted from range data, extracting them from images and videos alone is challenging, and doing so would benefit a variety of robotics applications. We trained a deep convolutional neural network (CNN) to identify occlusion edges in images and videos with both RGB-D and RGB inputs. Using a CNN avoids hand-crafting features for automatically isolating occlusion edges and distinguishing them from appearance edges. In addition to quantitative occlusion edge detection results, we provide qualitative results that demonstrate the trade-off between high-resolution analysis and frame-level computation time, which is critical for real-time robotics applications.
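
To make the notion of a range discontinuity concrete, the following minimal sketch marks occlusion edges directly from a depth map by thresholding the depth difference between neighboring pixels. It is not the CNN approach described above; it only illustrates the range-data baseline. The function name `occlusion_edges`, the `threshold` parameter, and its default value are assumptions introduced for illustration.

```python
def occlusion_edges(depth, threshold=0.5):
    """Mark the nearer pixel on each side of a large depth jump.

    depth: 2D list of range values (units are an assumption, e.g. meters).
    threshold: hypothetical minimum depth difference that counts as a
               range discontinuity.
    """
    rows, cols = len(depth), len(depth[0])
    edges = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Compare each pixel with its right and lower neighbor.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr >= rows or cc >= cols:
                    continue
                if abs(depth[r][c] - depth[rr][cc]) > threshold:
                    # The occluding surface is the one closer to the sensor.
                    if depth[r][c] < depth[rr][cc]:
                        edges[r][c] = True
                    else:
                        edges[rr][cc] = True
    return edges


# A near surface (depth 1) in front of a far surface (depth 3).
depth_map = [
    [1.0, 1.0, 3.0, 3.0],
    [1.0, 1.0, 3.0, 3.0],
]
edge_map = occlusion_edges(depth_map)
```

In this toy depth map, only the column of near pixels bordering the far surface is marked. A fixed threshold like this relies on clean range data, which is one reason a learned detector that also works on RGB input is attractive.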
