Real-Time and Accurate Stereo: A Scalable Approach With Bitwise Fast Voting on CUDA

This paper proposes a real-time design for accurate stereo matching on the Compute Unified Device Architecture (CUDA). We present a leading local algorithm and then accelerate it through parallel computing. High matching accuracy is achieved by cost aggregation over shape-adaptive support regions and by disparity refinement using reliable initial estimates. A novel sample-and-restore scheme makes the algorithm scalable, attaining a speedup of several times at the cost of minor accuracy degradation. The refinement and the restoration are jointly realized by a local voting method. To accelerate the voting on CUDA, we propose a graphics processing unit (GPU)-oriented bitwise fast voting method, which is two orders of magnitude faster than the traditional histogram-based approach. The whole algorithm is parallelized on CUDA at a fine granularity, efficiently exploiting the computing resources of GPUs. Our design is among the fastest stereo matching methods on GPUs. Evaluated on the Middlebury stereo benchmark, the proposed design produces the most accurate results among the real-time methods. Its speed, accuracy, and scalability make the design well suited to practical applications such as robotics systems and multiview teleconferencing.
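
The contrast between histogram-based voting and bitwise voting can be illustrated with a small CPU-side sketch. It is not the paper's CUDA kernel: it assumes that bitwise voting decides each bit of the output disparity by a majority vote over the pixels in a support region, and the function names, the 6-bit disparity width, and the sample values are hypothetical choices made for illustration only.

```python
from collections import Counter


def histogram_vote(disparities):
    """Reference approach: return the most frequent disparity.

    Ties are broken toward the smallest disparity (an arbitrary choice).
    """
    counts = Counter(disparities)
    best = max(counts.values())
    return min(d for d, c in counts.items() if c == best)


def bitwise_vote(disparities, num_bits=6):
    """Assumed bitwise scheme: majority vote on each bit plane.

    Bit k of the result is set when more than half of the input
    disparities have bit k set. Only num_bits counters are needed,
    instead of one histogram bin per disparity level.
    """
    n = len(disparities)
    result = 0
    for bit in range(num_bits):
        ones = sum((d >> bit) & 1 for d in disparities)
        if 2 * ones > n:
            result |= 1 << bit
    return result


# Hypothetical support region: mostly disparity 12, with two outliers.
support = [12, 12, 12, 13, 12, 40, 12]
print(histogram_vote(support), bitwise_vote(support))
```

With a clear majority value in the region, as in this sample, both functions return the same disparity. In general the per-bit majority is an approximation of the histogram mode and can differ from it when no single disparity dominates the region.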
