Real-time accurate stereo with bitwise fast voting on CUDA

This paper proposes a real-time design for accurate stereo matching on the Compute Unified Device Architecture (CUDA). We adopt a leading local algorithm because of its high data parallelism. To improve matching accuracy, we propose a GPU-oriented bitwise fast voting method that is substantially faster than the histogram-based approach. The whole algorithm is parallelized on CUDA at a fine granularity, efficiently exploiting the computing resources of GPUs, and on-chip shared memory is used to hide memory access latency. Compared to its CPU counterpart, our design attains a speedup factor of 52. While maintaining high matching accuracy, the proposed design remains among the fastest stereo methods on GPUs. This combination of speed and accuracy makes the design well suited to practical applications such as robotic systems and multiview teleconferencing.
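
The abstract does not spell out how the bitwise voting works. As a rough illustration only, the sketch below assumes it means a per-bit majority vote over the disparities in a pixel's support region: each bit of the output is set when more than half of the candidates have that bit set. The function name `bitwise_vote`, the 8-bit disparity type, and the host-side C++ form are all assumptions made for this example and are not taken from the paper. A real GPU version would run one such vote per pixel in a CUDA thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the paper): per-bit majority vote over the
// disparity values collected from one pixel's support region.
// Assumption: disparities fit in 8 bits.
std::uint8_t bitwise_vote(const std::vector<std::uint8_t>& disparities,
                          int num_bits = 8) {
    const std::size_t n = disparities.size();
    std::uint8_t result = 0;
    for (int b = 0; b < num_bits; ++b) {
        // Count how many candidates have bit b set.
        std::size_t ones = 0;
        for (std::uint8_t d : disparities) {
            ones += (d >> b) & 1u;
        }
        // Set bit b in the result only on a strict majority.
        if (2 * ones > n) {
            result |= static_cast<std::uint8_t>(1u << b);
        }
    }
    return result;
}
```

Under this assumption the cost is one pass per bit (8 passes for 8-bit disparities) instead of one histogram bin per disparity level. The result equals the most frequent disparity whenever one value holds a strict majority of the region. Otherwise it is only an approximation of the histogram mode.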
