Paper information - A two-level real-time vision machine combining coarse- and fine-grained parallelism

In this paper, we describe a real-time vision machine that takes a stereo camera as input and generates visual information at two levels of abstraction. The system provides low-level and mid-level visual information: dense stereo and optical flow, egomotion, an indication of areas containing independently moving objects, and a condensed geometric description of the scene. The system operates at more than 20 Hz using a hybrid architecture consisting of one dual-GPU card and one quad-core CPU. The processing stages have quite different characteristics, which in some cases make fine-grained parallelization on a GPU less applicable. However, most of the stages that cannot be implemented efficiently on a GPU are amenable to coarse-grained parallelization across multiple CPU cores. We show that with such hybrid parallelism we achieve a speedup of approximately a factor of 90 and a latency reduction of a factor of 26 compared to processing on a single CPU core. Since the vision machine provides generic visual information, it can be used in many contexts. Currently it is used in a driver assistance context as well as in two robotic applications.
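The abstract reports a throughput speedup and a latency reduction as two separate figures. One common reason the two can differ is pipelining: when stages run concurrently on different devices, throughput is bounded by the slowest stage, while latency is still the sum of all stages. The sketch below illustrates that general effect only. The pipeline model, the function names, and every stage timing are hypothetical assumptions, not taken from the paper, and the abstract does not state that this accounts for its reported factors.

```python
# Toy model (an assumption, not the paper's method): compare a serial
# single-core baseline with an accelerated, pipelined execution.

def serial_metrics(stage_ms):
    """One core runs all stages back to back for each frame.

    Returns (latency in ms, throughput in Hz).
    """
    total = sum(stage_ms)
    return total, 1000.0 / total


def pipelined_metrics(stage_ms):
    """Stages run concurrently on separate devices, one frame per stage.

    Latency is still the sum of the stage times; throughput is limited
    by the slowest stage. Returns (latency in ms, throughput in Hz).
    """
    return sum(stage_ms), 1000.0 / max(stage_ms)


# Hypothetical stage times in milliseconds (illustrative values only).
baseline_stages = [600, 300, 100]   # single CPU core
hybrid_stages = [20, 10, 10]        # after GPU / multi-core acceleration

base_latency, base_hz = serial_metrics(baseline_stages)
hyb_latency, hyb_hz = pipelined_metrics(hybrid_stages)

speedup = hyb_hz / base_hz                      # throughput gain
latency_reduction = base_latency / hyb_latency  # latency gain

print(f"speedup: {speedup:.1f}x, latency reduction: {latency_reduction:.1f}x")
```

With these made-up timings the throughput gain is twice the latency gain, because the pipelined system overlaps frames across stages while each individual frame still passes through every stage.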
