MTStereo 2.0: improved accuracy of stereo depth estimation with Max-trees

Systems with limited power budgets, such as robots and embedded devices, require efficient yet accurate extraction of depth from stereo image pairs. State-of-the-art stereo matching methods based on convolutional neural networks (CNNs) require intensive computations on GPUs and are difficult to deploy on embedded systems. In this paper, we propose a stereo matching method, called MTStereo 2.0, for limited-resource systems that require efficient and accurate depth estimation. It is based on a Max-tree hierarchical representation of image pairs, which we use to identify matching regions along image scan-lines. The method includes a cost function that considers the similarity of region contextual information based on the Max-trees, and a cost aggregation approach that preserves disparity borders. MTStereo 2.0 improves on its predecessor MTStereo 1.0 as it a) deploys a more robust cost function, b) performs more thorough detection of incorrect matches, and c) computes disparity maps with pixel-level rather than node-level precision. MTStereo 2.0 provides accurate sparse and semi-dense depth estimation and does not require intensive GPU computations like methods based on CNNs. Thus it can run on embedded and robotic devices with low-power requirements. We tested the proposed approach on several benchmark data sets, namely KITTI 2015, Driving, FlyingThings3D, Middlebury 2014, Monkaa and the TrimBot2020 garden data sets, and achieved competitive accuracy and efficiency. The code is available at this https URL.
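To make the Max-tree representation of a scan-line concrete, the following is a minimal sketch, not the authors' implementation. It builds a 1-D Max-tree for a single non-empty row of intensities: each node is a maximal interval of pixels whose values are at least the node's level, and its children are the brighter sub-intervals nested inside it. The names `Node` and `build_max_tree_1d` are hypothetical, and the simple recursive construction is chosen for readability rather than speed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class Node:
    """One Max-tree node: an interval [start, end) of a scan-line."""
    level: int                 # minimum intensity inside the interval
    start: int                 # first pixel index (inclusive)
    end: int                   # last pixel index (exclusive)
    children: List["Node"] = field(default_factory=list)


def build_max_tree_1d(row: Sequence[int], start: int = 0,
                      end: Optional[int] = None) -> Node:
    """Recursively build the Max-tree of row[start:end] (must be non-empty)."""
    if end is None:
        end = len(row)
    level = min(row[start:end])
    node = Node(level, start, end)
    i = start
    while i < end:
        if row[i] > level:
            # Extend a maximal run of pixels strictly brighter than this level.
            j = i
            while j < end and row[j] > level:
                j += 1
            node.children.append(build_max_tree_1d(row, i, j))
            i = j
        else:
            i += 1
    return node
```

For example, the row `[1, 3, 3, 1, 2, 5, 2]` yields a root at level 1 covering the whole row, with two children: the interval `[1, 3)` at level 3 and the interval `[4, 7)` at level 2, the latter containing a single child `[5, 6)` at level 5. Matching regions along scan-lines, as the abstract describes, would then compare nodes of such trees built from the left and right images.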
