Architectures for Stereo Vision

Stereo vision is a fundamental problem underlying many computer vision tasks. It has been widely studied with two goals: improving the quality of the results and accelerating the computation. This chapter provides theoretical background on stereo vision systems and discusses architectures and implementations for real-time applications. In particular, the most computationally intensive part, stereo matching, is discussed using one of the leading algorithms, semi-global matching (SGM), as an example. For this algorithm, two implementations are presented in detail on two of the most relevant platforms for real-time image processing today: Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs). This illustrates the major differences in designing parallelization techniques for two very different image processing platforms.
