Occlusion Edge Detection in RGB-D Frames using Deep Convolutional Networks

ABSTRACT

Occlusion edges in images, which correspond to range discontinuities in the scene from the point of view of the observer, are an important prerequisite for many vision and mobile robot tasks. Although they can be extracted from range data, extracting them from images and videos would be extremely beneficial. We trained a deep convolutional neural network (CNN) to identify occlusion edges in images and videos with both RGB-D and RGB inputs. The use of a CNN avoids hand-crafting of features for automatically isolating occlusion edges and distinguishing them from appearance edges. In addition to quantitative occlusion edge detection results, qualitative results are provided to demonstrate the trade-off between high-resolution analysis and frame-level computation time, which is critical for real-time robotics applications.
