Unified Confidence Estimation Networks for Robust Stereo Matching

We present a deep architecture that estimates stereo confidence, which is essential for improving the accuracy of stereo matching algorithms. In contrast to existing methods based on deep convolutional neural networks (CNNs), which rely on either the matching cost volume or the estimated disparity map alone, our network estimates the stereo confidence from these two heterogeneous inputs simultaneously. Specifically, the matching probability volume is first computed from the matching cost volume with residual networks and a pooling module, in a manner that yields greater robustness. The confidence is then estimated by a unified deep network that combines confidence features extracted from both the matching probability volume and its corresponding disparity map. In addition, our method extracts the confidence features of the disparity map by applying multiple convolutional filters of varying sizes to the input disparity map. To train our networks in a semi-supervised manner, we propose a novel loss function that uses confident points to compute the image reconstruction loss. To validate the effectiveness of our method in a disparity post-processing step, we employ three post-processing approaches: cost modulation, ground control points-based propagation, and aggregated ground control points-based propagation. Experimental results demonstrate that our method outperforms state-of-the-art confidence estimation methods on various benchmarks.
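The idea of restricting an image reconstruction loss to confident points can be illustrated with a minimal NumPy sketch. It is not the loss proposed in the paper, whose exact form the abstract does not give. It assumes grayscale images, a left-view disparity map, nearest-neighbour warping, an L1 photometric error, and a hard confidence threshold. The function names `warp_right_to_left` and `confident_reconstruction_loss` and the `threshold` parameter are hypothetical.

```python
import numpy as np


def warp_right_to_left(right, disparity):
    """Reconstruct the left view by sampling the right image at x - d.

    Nearest-neighbour sampling is an assumption made for brevity.
    Returns the reconstruction and a mask of pixels whose source
    column falls inside the image.
    """
    h, w = disparity.shape
    src_x = np.arange(w)[None, :] - np.rint(disparity).astype(int)
    valid = (src_x >= 0) & (src_x < w)
    src_x = np.clip(src_x, 0, w - 1)
    rows = np.arange(h)[:, None]
    return right[rows, src_x], valid


def confident_reconstruction_loss(left, right, disparity, confidence,
                                  threshold=0.5):
    """Mean L1 photometric error over confident, in-bounds pixels only.

    `threshold` is a hypothetical hyperparameter, not taken from the paper.
    """
    recon, valid = warp_right_to_left(right, disparity)
    mask = valid & (confidence > threshold)
    if not mask.any():
        return 0.0
    return float(np.abs(left - recon)[mask].mean())
```

In this sketch, a pixel with a wrong disparity contributes to the loss only if its confidence exceeds the threshold, so low-confidence errors do not affect the reconstruction term.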
