Paper Information - HARP: Hierarchical Attention Oriented Region-Based Processing for High-Performance Computation in Vision Sensor


With the rapid advancement of complementary metal-oxide-semiconductor (CMOS) image sensors, cameras are widely adopted for their high image quality, while the computation of vision applications is offloaded to the cloud. This raises concerns for time-critical applications such as autonomous driving, surveillance, and defense systems, since moving pixels off the sensor’s focal plane is expensive. This paper presents a hardware architecture for smart cameras that identifies the salient regions of an image frame and then performs high-level inference to create information at the sensor level instead of transporting raw pixels. A visual attention-oriented computational strategy filters out a significant amount of redundant spatiotemporal data collected at the focal plane. A computationally expensive learning model is then applied to the regions of interest in the image. The hierarchical processing in the pixels’ data path follows a bottom-up architecture with massive parallelism and achieves high throughput by exploiting the large bandwidth available at the image source. We prototype the model on a field-programmable gate array (FPGA) and as an application-specific integrated circuit (ASIC) for integration with a pixel-parallel image sensor. The experimental results show that our approach achieves significant speedup and, under certain conditions, up to 45% higher energy efficiency with attention-oriented processing. Although incorporating attention-oriented processing incurs an area overhead, the gains in energy consumption, latency, and memory utilization outweigh that limitation.
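The two-stage flow described in the abstract (a cheap attention step that discards redundant data, followed by an expensive model on the remaining regions) can be illustrated in software. The following is a minimal Python sketch, not the authors' hardware design. It assumes temporal frame differencing as a stand-in saliency measure and fixed square tiles as regions; the names `salient_tiles`, `process_frame`, `model`, `tile`, and `threshold` are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List, Tuple

Frame = List[List[int]]  # grayscale image stored as rows of pixel intensities


def salient_tiles(prev: Frame, curr: Frame, tile: int,
                  threshold: float) -> List[Tuple[int, int]]:
    """Return the (row, col) origins of tiles whose mean absolute temporal
    difference exceeds `threshold`. This is an assumed saliency measure."""
    height, width = len(curr), len(curr[0])
    selected = []
    for r0 in range(0, height, tile):
        for c0 in range(0, width, tile):
            rows = range(r0, min(r0 + tile, height))
            cols = range(c0, min(c0 + tile, width))
            total = sum(abs(curr[r][c] - prev[r][c]) for r in rows for c in cols)
            if total / (len(rows) * len(cols)) > threshold:
                selected.append((r0, c0))
    return selected


def process_frame(prev: Frame, curr: Frame,
                  model: Callable[[Frame], object],
                  tile: int = 4,
                  threshold: float = 10.0) -> Dict[Tuple[int, int], object]:
    """Run the expensive `model` only on salient tiles and skip the rest."""
    results = {}
    for r0, c0 in salient_tiles(prev, curr, tile, threshold):
        # Crop the tile; slices are clipped automatically at the frame border.
        patch = [row[c0:c0 + tile] for row in curr[r0:r0 + tile]]
        results[(r0, c0)] = model(patch)
    return results
```

In this sketch the cost of the expensive stage scales with the number of tiles that change between frames rather than with the full frame size, which is the intuition behind filtering at the source. The hardware in the paper is massively parallel, whereas this loop is sequential and serves only to show the data flow.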

Christophe Bobda | Md Jubaer Hossain Pantho | Pankaj Bhowmik
