ZenCam: Context-Driven Control of Autonomous Body Cameras

In this paper, we present ZenCam, an always-on body camera that exploits readily available information in the encoded video stream from the on-chip firmware to classify the dynamics of the scene. This scene context is combined with a simple inertial measurement unit (IMU)-based activity-level context of the wearer to optimally control the camera configuration at run time, keeping the device under the desired energy budget. We describe the design and implementation of ZenCam and thoroughly evaluate its performance in real-world scenarios. Our evaluation shows a 29.8-35% reduction in energy consumption and a 48.1-49.5% reduction in storage usage compared to a standard baseline of 1920x1080 at 30 fps, while maintaining competitive or better video quality at minimal computational overhead.
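The control idea described above can be sketched as a small decision rule: derive a scene-dynamics score (e.g., from motion-vector magnitudes already present in the encoded stream), derive a wearer activity level from the IMU, and pick the richest camera configuration that both the combined context and the remaining energy budget justify. The following is a minimal illustrative sketch; the configuration table, thresholds, and function names are assumptions for illustration, not the paper's actual controller.

```python
# Hypothetical sketch of context-driven camera control.
# CONFIGS, the context mapping, and the budget cap are illustrative
# assumptions, not ZenCam's published algorithm.

CONFIGS = [  # (width, height, fps), ordered from lowest to highest energy cost
    (1280, 720, 15),
    (1280, 720, 30),
    (1920, 1080, 15),
    (1920, 1080, 30),
]

def activity_level(accel_magnitudes):
    """Crude IMU activity level: variance of acceleration magnitude."""
    n = len(accel_magnitudes)
    mean = sum(accel_magnitudes) / n
    return sum((a - mean) ** 2 for a in accel_magnitudes) / n

def choose_config(scene_dynamics, accel_magnitudes, energy_headroom):
    """Pick a camera configuration from scene and wearer context.

    scene_dynamics: score in [0, 1], e.g. normalized motion-vector
        magnitude taken from the encoded video stream.
    energy_headroom: fraction in [0, 1] of the energy budget remaining.
    """
    # Combine the two contexts: the busier of scene or wearer wins.
    context = max(scene_dynamics, min(activity_level(accel_magnitudes), 1.0))
    desired = int(context * (len(CONFIGS) - 1) + 0.5)      # context -> index
    budget_cap = int(energy_headroom * (len(CONFIGS) - 1) + 0.5)
    return CONFIGS[min(desired, budget_cap)]               # never exceed budget
```

For example, a static scene with a still wearer selects the cheapest setting, while a dynamic scene with full energy headroom selects 1920x1080 at 30 fps.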
