NeuriCam: Key-Frame Video Super-Resolution and Colorization for IoT Cameras

We present NeuriCam, a novel deep learning-based system for capturing video with low-power dual-mode IoT cameras. Our idea is to design a dual-mode camera system in which the first mode is low-power (1.1 mW) but outputs only greyscale, low-resolution, noisy video, while the second mode consumes much more power (100 mW) but outputs color, higher-resolution images. To reduce total energy consumption, we heavily duty-cycle the high-power mode so that it outputs an image only once every second. The data from this camera system is then sent wirelessly to a nearby plugged-in gateway, where our real-time neural network decoder reconstructs a higher-resolution color video. To achieve this, we introduce an attention feature filter mechanism that assigns different weights to different features, based on the correlation between the feature map and the contents of the input frame at each spatial location. We design a wireless hardware prototype using off-the-shelf cameras and address practical issues including packet loss and perspective mismatch. Our evaluations show that our dual-camera approach reduces energy consumption by 7x compared to existing systems. Further, our model achieves an average greyscale PSNR gain of 3.7 dB over prior single- and dual-camera video super-resolution methods and an RGB gain of 5.6 dB over prior color propagation methods. Open-source code: https://github.com/vb000/NeuriCam.
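The attention feature filter described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the "correlation" is a scaled dot product between each candidate feature vector and a vector describing the input frame at the same location, and that the weights come from a softmax over the candidates. The function names `filter_features_at_location` and `attention_feature_filter`, and the nested-list data layout, are hypothetical and chosen only for illustration.

```python
import math


def filter_features_at_location(features, query):
    """Weight N candidate feature vectors at ONE spatial location.

    features: list of N vectors (each of length C), one per feature map.
    query:    vector of length C describing the input frame at this location.
    Returns (weights, fused); weights sum to 1, fused is the weighted sum.
    Assumption: correlation = scaled dot product, weights = softmax.
    """
    c = len(query)
    # Correlation of each feature vector with the input-frame vector.
    scores = [
        sum(f_k * q_k for f_k, q_k in zip(f, query)) / math.sqrt(c)
        for f in features
    ]
    # Numerically stable softmax over the N candidates.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted combination of the candidate feature vectors.
    fused = [sum(w * f[k] for w, f in zip(weights, features)) for k in range(c)]
    return weights, fused


def attention_feature_filter(feature_maps, query_map):
    """Apply the per-location filter over an H x W grid.

    feature_maps: list of N maps, each indexed as map[y][x] -> vector of length C.
    query_map:    one map indexed as query_map[y][x] -> vector of length C.
    Returns a fused map indexed as fused[y][x] -> vector of length C.
    """
    height, width = len(query_map), len(query_map[0])
    return [
        [
            filter_features_at_location(
                [fmap[y][x] for fmap in feature_maps], query_map[y][x]
            )[1]
            for x in range(width)
        ]
        for y in range(height)
    ]
```

Because the weights are computed independently at every location, a feature map that matches the input frame well in one region can dominate there while being suppressed elsewhere. A learned model would typically derive the query and feature vectors from trained layers instead of using raw values as this sketch does.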
