Depth Estimation with Multi-Resolution Stereo Matching

Depth estimation is in wide demand for autonomous driving and scene reconstruction. Although deep learning has greatly improved depth estimation, there is still room for improvement. Stereo-matching-based depth estimation usually matches low-resolution features and then up-samples the depth map to full resolution. Such methods suffer from low accuracy because information is lost in the low-resolution features. To solve this problem, a depth estimation method based on multi-resolution, gradually refined stereo matching is proposed. As in classic methods, the method first extracts pyramid features with a convolutional network and estimates a low-resolution depth map. Then, as its main innovation, stereo matching is executed successively at each pyramid level, using the lower-resolution depth map as the initial depth; this limits the search range of stereo matching, which benefits both accuracy and efficiency. The results of stereo matching at each level are used as residuals to refine the depth map. Experimental results demonstrate that the proposed method significantly improves the accuracy of depth estimation, while its computational complexity increases only slightly compared with classic methods.
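
The coarse-to-fine idea described above can be illustrated with a minimal NumPy sketch. It is not the proposed network: 2x2 average pooling stands in for the learned pyramid features, a per-pixel absolute difference stands in for the learned matching cost, and integer disparity is estimated instead of metric depth. All function names and parameters (`downsample`, `upsample_disparity`, `match_residual`, `coarse_to_fine_disparity`, `coarse_range`, `radius`) are hypothetical and chosen only for illustration; image sides are assumed divisible by 2^(levels-1).

```python
import numpy as np

def downsample(img):
    """Halve the resolution by 2x2 average pooling (stand-in for CNN pyramid features)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample_disparity(disp):
    """Nearest-neighbour 2x up-sampling; disparities double along with the image width."""
    return np.repeat(np.repeat(disp, 2, axis=0), 2, axis=1) * 2

def match_residual(left, right, init, offsets):
    """For each pixel, pick the offset o that minimises |left(x) - right(x - (init + o))|."""
    h, w = left.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    best_cost = np.full((h, w), np.inf)
    best_off = np.zeros((h, w), dtype=np.int64)
    for o in offsets:
        # Clamp source columns at the image border (border pixels are unreliable).
        src = np.clip(cols - (init + o), 0, w - 1)
        cost = np.abs(left - right[rows, src])
        better = cost < best_cost
        best_cost[better] = cost[better]
        best_off[better] = o
    return best_off

def coarse_to_fine_disparity(left, right, levels=3, coarse_range=4, radius=1):
    # Build the pyramids, finest level first.
    pyr_l, pyr_r = [left], [right]
    for _ in range(levels - 1):
        pyr_l.append(downsample(pyr_l[-1]))
        pyr_r.append(downsample(pyr_r[-1]))

    # Full search over [0, coarse_range] at the coarsest level only.
    zero = np.zeros(pyr_l[-1].shape, dtype=np.int64)
    disp = match_residual(pyr_l[-1], pyr_r[-1], zero, range(coarse_range + 1))

    # At each finer level, search only within +/- radius of the up-sampled
    # estimate and add the matched offset as a residual.
    for l, r in zip(pyr_l[-2::-1], pyr_r[-2::-1]):
        init = upsample_disparity(disp)
        disp = init + match_residual(l, r, init, range(-radius, radius + 1))
    return disp

# Synthetic pair: the left view is the right view shifted by 8 pixels.
rng = np.random.default_rng(0)
right = rng.random((32, 64))
left = np.roll(right, 8, axis=1)
disp = coarse_to_fine_disparity(left, right)
```

In this sketch the full-resolution level tests only three candidates per pixel (`radius=1`) rather than the whole disparity range, which is the source of the efficiency gain the abstract refers to; the trade-off is that an error at the coarsest level larger than the refinement radius cannot be recovered at finer levels.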
