Unsupervised stereo matching using correspondence consistency

Deep convolutional neural networks (CNNs) have brought substantial performance improvements to matching cost computation in stereo matching. However, conventional CNN-based approaches that train the network in a supervised manner require a large number of ground-truth disparity maps, which limits their applicability. To overcome this limitation, we present a novel framework that learns a CNN architecture for matching cost computation in an unsupervised manner. Our method combines image domain learning with stereo epipolar constraints. Exploiting the correspondence consistency between stereo images as supervision, our method selects training samples in each iteration during network training and uses them to learn the network. To boost performance, we also propose a multi-scale cost computation scheme. Experimental results show that our method outperforms state-of-the-art methods, including supervised learning-based methods, on various benchmarks.
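
The sample-selection idea can be illustrated with a standard left-right consistency check. This is a minimal sketch and not necessarily the criterion used in the paper: it assumes rectified images (so matches lie on the same row), integer disparity maps estimated from the current network for both views, and a hypothetical tolerance `tau`. The function names `consistent_mask` and `select_training_samples` are illustrative only.

```python
def consistent_mask(disp_left, disp_right, tau=1):
    """Left-right consistency check on integer disparity maps (lists of rows).

    A left pixel (y, x) with disparity d maps to the right pixel (y, x - d).
    The pixel is kept when the right map's disparity there agrees within tau.
    """
    height, width = len(disp_left), len(disp_left[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d = disp_left[y][x]
            xr = x - d  # matching column in the right image (epipolar constraint)
            if 0 <= xr < width:
                mask[y][x] = abs(d - disp_right[y][xr]) <= tau
    return mask


def select_training_samples(disp_left, disp_right, tau=1):
    """Return (y, x_left, x_right) triples for pixels that pass the check."""
    mask = consistent_mask(disp_left, disp_right, tau)
    return [(y, x, x - disp_left[y][x])
            for y, row in enumerate(mask)
            for x, keep in enumerate(row) if keep]


# Toy single-row example (values are made up for illustration).
disp_left = [[0, 1, 1, 2, 4]]
disp_right = [[1, 1, 1, 0, 0]]
samples = select_training_samples(disp_left, disp_right, tau=0)
```

In an unsupervised loop of this kind, the selected pairs would serve as positive matches for the next training iteration, and the check would be repeated as the disparity estimates improve.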
