Paper information - PRECISION: A Reconfigurable SIMD/MIMD Coprocessor for Computer Vision Systems-on-Chip

Computer vision applications differ widely in their operations, data representations and memory access patterns, from the early vision stages to the final classification and recognition stages. A hardware system for computer vision must therefore provide high flexibility without compromising performance: it must exploit massively spatial-parallel operations while sustaining high throughput on data-dependent, complex program flows. Furthermore, the architecture must be modular, scalable and easy to adapt to the needs of different applications. With these requirements in mind, a hybrid SIMD/MIMD architecture for embedded computer vision is proposed. It consists of a coprocessor designed to provide fast and flexible computation of the demanding image processing tasks in vision applications. A 32-bit, 128-unit device was prototyped on a Virtex-6 FPGA; it delivers a peak performance of 19.6 GOP/s with a power dissipation of 7.2 W.

Victor M. Brea | David López Vilariño | Alejandro Nieto
