SAPPHIRE: An always-on context-aware computer vision system for portable devices

Being aware of objects in the ambient environment provides a new dimension of context awareness. Towards this goal, we present a system that exploits powerful computer-vision algorithms in the cloud by collecting data through always-on cameras on portable devices. To reduce communication-energy costs, our system allows client devices to continually analyze streams of video and distill out frames that contain objects of interest. Through a dedicated image-classification engine, SAPPHIRE, we show that if an object appears in 5% of all frames, selecting 30% of them suffices to detect the object 90% of the time: a 70% data reduction on the client device at a cost of ≤ 60 mW of power (45 nm ASIC). By doing so, we demonstrate system-level energy reductions of ≥ 2×. Thanks to multiple levels of pipelining and parallel vector-reduction stages, SAPPHIRE consumes only 3.0 mJ/frame and 38 pJ/OP, estimated to be 11.4× lower than a 45 nm GPU, while delivering slightly higher peak performance (29 vs. 20 GFLOPS). Further, compared with a parallelized software implementation on a mobile CPU, it provides a processing speedup of up to 235× (1.81 s vs. 7.7 ms/frame), which is necessary to meet the real-time processing needs of an always-on context-aware system.
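For readers checking the headline numbers, the quoted ratios follow directly from the reported measurements. The short derivation below restates the abstract's figures; the operations-per-frame count in the last line is our own inference from the two quoted energy numbers, not a value reported by the paper:

\[
\text{speedup} = \frac{1.81\ \mathrm{s/frame\ (mobile\ CPU)}}{7.7\ \mathrm{ms/frame\ (SAPPHIRE)}} \approx 235\times,
\qquad
\text{data reduction} = 1 - 0.30 = 70\%
\]

\[
\text{implied work per frame} \approx \frac{3.0\ \mathrm{mJ/frame}}{38\ \mathrm{pJ/OP}} \approx 7.9 \times 10^{7}\ \mathrm{OP/frame}
\]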
