Low-power object counting with hierarchical neural networks

Deep Neural Networks (DNNs) achieve state-of-the-art accuracy in many computer vision tasks, such as object counting. Object counting takes two inputs, an image and an object query, and reports the number of occurrences of the queried object in the image. To achieve high accuracy, DNNs require billions of operations, making them difficult to deploy on resource-constrained, low-power devices. Prior work shows that a significant number of DNN operations are redundant and can be eliminated without affecting accuracy. To reduce these redundancies, we propose a hierarchical DNN architecture for object counting. This architecture uses a Region Proposal Network (RPN) to propose regions-of-interest (RoIs) that may contain the queried objects. A hierarchical classifier then efficiently finds the RoIs that actually contain the queried objects. The hierarchy is built from groups of visually similar object categories, and a small DNN at each node of the hierarchy distinguishes between these groups. The hierarchical classifier processes the RoIs incrementally: if the object in an RoI is in the same group as the queried object, the next DNN in the hierarchy processes the RoI further; otherwise, the RoI is discarded. Because only a few small DNNs process each image, this method reduces the memory requirement, inference time, energy consumption, and number of operations with negligible accuracy loss compared with existing techniques.
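
The incremental RoI processing described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the RPN is not modeled (RoIs are supplied directly as dictionaries), each node's small DNN is replaced by a hypothetical lookup stub, and all names (`Node`, `count_objects`, the toy animal/vehicle hierarchy) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Node:
    """One node of the hierarchy. `classify` stands in for a small DNN
    that assigns an RoI to one of this node's child groups."""
    name: str
    classify: Optional[Callable[[dict], str]] = None
    children: Dict[str, "Node"] = field(default_factory=dict)

    def contains(self, category: str) -> bool:
        """True if `category` is this leaf or lies somewhere below this node."""
        if not self.children:
            return self.name == category
        return any(child.contains(category) for child in self.children.values())


def count_objects(root: Node, rois: List[dict], query: str) -> Tuple[int, int]:
    """Return (count of RoIs matching `query`, number of classifier calls)."""
    count = 0
    evaluations = 0
    for roi in rois:
        node = root
        discarded = False
        while node.children:
            predicted = node.classify(roi)  # hypothetical small-DNN call
            evaluations += 1
            child = node.children[predicted]
            if not child.contains(query):
                discarded = True  # wrong group: stop processing this RoI
                break
            node = child  # same group as the query: descend one level
        if not discarded and node.name == query:
            count += 1
    return count, evaluations


# Toy two-level hierarchy (assumed for illustration): groups, then categories.
animal = Node("animal", classify=lambda roi: roi["category"],
              children={"cat": Node("cat"), "dog": Node("dog")})
vehicle = Node("vehicle", classify=lambda roi: roi["category"],
               children={"car": Node("car"), "bus": Node("bus")})
root = Node("root", classify=lambda roi: roi["group"],
            children={"animal": animal, "vehicle": vehicle})

# Stand-ins for RPN proposals; a real system would pass image crops.
rois = [
    {"group": "animal", "category": "cat"},
    {"group": "animal", "category": "dog"},
    {"group": "vehicle", "category": "car"},
    {"group": "animal", "category": "cat"},
    {"group": "vehicle", "category": "bus"},
]

print(count_objects(root, rois, "cat"))  # (2, 8)
```

In this toy run, the two vehicle RoIs are discarded after a single classifier call at the root, so a query for "cat" needs 8 classifier calls rather than the 10 that would be required if every RoI were pushed through both levels.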
