Fast depth estimation from single image using structured forest

Depth estimation from single image is an important component of many vision systems, including robot navigation, motion capture and video surveillance. In this paper, we propose to apply a structure forest framework to infer depth information from single RGB image. The core idea of our approach is to exploit the structure properties exhibit in local patches of depth map to learn the depth level for each pixel. We formulate the problem of depth estimation in a structured learning framework based on random decision forests. Each trained forest infers a patch of structured labels that are accumulated across the image to obtain the final depth map. Moreover, we systematically investigate a variety of depth-relevant features and the regression forest framework automatically determines the best feature combination and uses as input of structure forest. Our approach achieves quasi real-time performance that is orders of magnitude faster than state-of-the-art approaches, while also achieving state-of-the-art depth estimation results on the Make3D dataset.

[1]  Peter Kontschieder,et al.  Structured class-labels in random forests for semantic image labelling , 2011, 2011 International Conference on Computer Vision.

[2]  Ce Liu,et al.  Depth Extraction from Video Using Non-parametric Sampling , 2012, ECCV.

[3]  Ashutosh Saxena,et al.  Make3D: Learning 3D Scene Structure from a Single Still Image , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Jan-Michael Frahm,et al.  Repetition-based dense single-view reconstruction , 2011, CVPR 2011.

[5]  Mohinder Malhotra Single Image Haze Removal Using Dark Channel Prior , 2016 .

[6]  Takeo Kanade,et al.  Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces , 2010, NIPS.

[7]  C. Lawrence Zitnick,et al.  Fast Edge Detection Using Structured Forests , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Xuming He,et al.  Discrete-Continuous Depth Estimation from a Single Image , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Antonio Torralba,et al.  Building a database of 3D scenes from user annotations , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Alexei A. Efros,et al.  Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics , 2010, ECCV.

[11]  Ian D. Reid,et al.  Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Stephen Gould,et al.  Single image depth estimation from predicted semantic labels , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  Rob Fergus,et al.  Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[14]  David A. Forsyth,et al.  Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry , 2010, ECCV.

[15]  Antonio Criminisi,et al.  Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning , 2012, Found. Trends Comput. Graph. Vis..

[16]  Rob Fergus,et al.  Depth Map Prediction from a Single Image using a Multi-Scale Deep Network , 2014, NIPS.