PRECISION: A Reconfigurable SIMD/MIMD Coprocessor for Computer Vision Systems-on-Chip

Computer vision applications exhibit large disparities in operations, data representation, and memory access patterns between the early vision stages and the final classification and recognition stages. A hardware system for computer vision must therefore provide high flexibility without compromising performance, exploiting massively spatial-parallel operations while sustaining high throughput on data-dependent, complex program flows. Furthermore, the architecture must be modular, scalable, and easy to adapt to the needs of different applications. With these requirements in mind, a hybrid SIMD/MIMD architecture for embedded computer vision is proposed. It consists of a coprocessor designed to provide fast and flexible computation of the demanding image processing tasks in vision applications. A 32-bit, 128-unit device prototyped on a Virtex-6 FPGA delivers a peak performance of 19.6 GOP/s with a power dissipation of 7.2 W.
