Convolutional cost aggregation for robust stereo matching

Although convolutional neural network (CNN)-based stereo matching methods have become increasingly popular thanks to their robustness, they have focused primarily on matching cost computation. By leveraging CNNs, we present a novel method for matching cost aggregation that boosts stereo matching performance. Our insight is to learn the convolution kernel for cost aggregation within a CNN architecture, in a fully convolutional manner. Tailored to the cost aggregation problem, our method differs from hand-crafted methods in that the aggregation is performed by convolution with kernels learned by CNNs. First, the matching cost is aggregated with a cost volume unary network; it is then optimized within a global energy minimization using explicit disparity boundaries estimated by a disparity boundary pairwise network. Experiments demonstrate that our method outperforms conventional hand-crafted aggregation methods.
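To make the idea of convolutional cost aggregation concrete, the following is a minimal sketch, not the networks described above. It assumes a cost volume stored as a NumPy array of shape (D, H, W) and uses a fixed 3x3 box kernel as a stand-in for a learned kernel; the function names `aggregate_cost_volume` and `winner_take_all` are hypothetical and chosen only for illustration.

```python
import numpy as np


def aggregate_cost_volume(cost, kernel):
    """Filter every disparity slice of `cost` (D, H, W) with a 2D kernel.

    Assumption: `kernel` is square with odd side length. As in CNN layers,
    this is a cross-correlation (the kernel is not flipped). In a learned
    method the kernel weights would come from training; here they are given.
    """
    d, h, w = cost.shape
    k = kernel.shape[0]
    r = k // 2
    # Replicate border values so the output keeps the input resolution.
    padded = np.pad(cost, ((0, 0), (r, r), (r, r)), mode="edge")
    out = np.zeros_like(cost, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += kernel[dy, dx] * padded[:, dy:dy + h, dx:dx + w]
    return out


def winner_take_all(cost):
    """Pick, per pixel, the disparity index with the lowest cost."""
    return np.argmin(cost, axis=0)


# Toy cost volume: the true disparity is 2 everywhere, with one noisy pixel
# whose raw cost wrongly favors disparity 0.
cost = np.ones((4, 6, 6))
cost[2] = 0.0
cost[2, 3, 3] = 1.0
cost[0, 3, 3] = 0.0

box = np.full((3, 3), 1.0 / 9.0)  # stand-in for a learned kernel
raw_disparity = winner_take_all(cost)
aggregated_disparity = winner_take_all(aggregate_cost_volume(cost, box))
```

In this toy case the raw winner-take-all result is wrong at the noisy pixel, while aggregation over the 3x3 neighborhood restores the surrounding disparity. A uniform box kernel blurs across disparity boundaries, which is the limitation that edge-aware and learned aggregation aim to address.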
