Feature Augmentation for Learning Confidence Measure in Stereo Matching

Confidence estimation is essential for refining stereo matching results in a post-processing step. The problem has recently been studied with learning-based approaches, which show a substantial improvement over conventional non-learning-based methods. However, learning-based methods that estimate the confidence of each pixel individually disregard the spatial coherency that may exist in the confidence map, which limits their performance under challenging conditions. Our key observation is that the confidence features and the resulting confidence maps vary smoothly in the spatial domain and are highly correlated within local regions of an image. We present a new approach that imposes spatial consistency on confidence estimation. Specifically, a set of robust confidence features is extracted from each superpixel obtained by a Gaussian mixture model decomposition, and these features are concatenated with pixel-level confidence features. The features are then enhanced through adaptive filtering in the feature domain. In addition, the confidence map, estimated from the confidence features with a random regression forest, is further improved through a K-nearest-neighbor-based aggregation scheme at both the pixel and superpixel levels. To validate the proposed confidence estimation scheme, we apply it to stereo matching through cost modulation or ground-control-point-based optimization. Experimental results demonstrate that the proposed method outperforms state-of-the-art approaches on various benchmarks, including challenging outdoor scenes.
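
The two spatial-consistency steps described in the abstract, concatenating superpixel-level features with pixel-level features and aggregating confidence over K nearest neighbors, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The function names `augment_features` and `knn_aggregate` are hypothetical, mean pooling stands in for the paper's robust superpixel features, the neighbor search is brute force, and the input confidences are assumed to come from some regressor such as a random regression forest.

```python
import math
from collections import defaultdict


def augment_features(pixel_feats, labels):
    """Concatenate each pixel's features with the mean features of its superpixel.

    pixel_feats: list of per-pixel feature vectors (lists of floats).
    labels: superpixel id of each pixel (assumed given by any segmentation).
    Mean pooling is an assumption made for this sketch only.
    """
    sums = {}
    counts = defaultdict(int)
    for feat, label in zip(pixel_feats, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
        for i, value in enumerate(feat):
            sums[label][i] += value
        counts[label] += 1
    means = {s: [v / counts[s] for v in total] for s, total in sums.items()}
    return [list(feat) + means[label] for feat, label in zip(pixel_feats, labels)]


def knn_aggregate(feats, conf, k):
    """Average each confidence over its k nearest neighbors in feature space.

    The query point itself counts as one of the k neighbors. Brute-force
    search is used for clarity; it is quadratic in the number of points.
    """
    aggregated = []
    for query in feats:
        order = sorted(range(len(feats)), key=lambda j: math.dist(query, feats[j]))
        nearest = order[:k]
        aggregated.append(sum(conf[j] for j in nearest) / len(nearest))
    return aggregated
```

Given per-pixel features, superpixel labels, and initial confidences, one would call `augment_features` first and then pass the augmented features to `knn_aggregate`. For image-sized inputs, a spatial index such as a k-d tree would be a more practical choice than the brute-force search used here.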
