Probabilistic Hierarchical Model Using First Person Vision for Scenario Recognition

Smart homes are an increasingly important way to provide a comfortable lifestyle for the elderly and to ease the burden on future caretakers. A key component of such systems is the recognition of human activities and scenarios. As wireless technologies advance, they offer a low-cost, non-intrusive, and privacy-conscious solution to activity recognition. In more complex environments, however, scenarios must be identified from subtle cues such as eye gaze. These situations call for a complementary vision-based approach, and we present a robust scenario recognition system that follows the objects observed along eye gaze trajectories. In this paper, we propose a probabilistic hierarchical model for scenario recognition based on environmental elements such as the objects in the scene. We exploit the fact that any scenario can be decomposed recursively into constituent tasks and activities, down to the level of atomic actions and objects. Viewed bottom-up, the scenario recognition problem can therefore be solved hierarchically by identifying the objects seen and combining them into coarse-grained, higher-level activities. Recognizing complete scenarios solely on the basis of the objects seen is a novel contribution. We performed experiments on the standard Georgia Tech Egocentric Activities (GTEA-Gaze) dataset and on unconstrained videos collected “in the Wild”, and trained an Artificial Neural Network that achieves a precision of 73.84% and an accuracy of 92.27%.
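To make the bottom-up idea concrete, the following is a minimal sketch (not the authors' implementation) of how gazed-object sequences could be mapped to activity labels with a small neural network and then pooled into a scenario-level decision. The object vocabulary, activity names, training sequences, and the simple averaging step are illustrative assumptions.

```python
# Minimal sketch of bottom-up scenario recognition from gazed-object
# sequences. All labels and data below are hypothetical placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier

OBJECTS = ["cup", "kettle", "tea_bag", "bread", "knife", "peanut_butter"]
ACTIVITIES = ["make_tea", "make_sandwich"]

def bag_of_objects(gazed_objects):
    """Encode a sequence of gazed-at objects as a normalized histogram."""
    vec = np.zeros(len(OBJECTS))
    for obj in gazed_objects:
        vec[OBJECTS.index(obj)] += 1.0
    return vec / max(vec.sum(), 1.0)

# Hypothetical training data: gaze segments labelled with the
# fine-grained activity they belong to.
train_sequences = [
    (["kettle", "cup", "tea_bag", "cup"], "make_tea"),
    (["cup", "kettle", "kettle", "tea_bag"], "make_tea"),
    (["bread", "knife", "peanut_butter", "bread"], "make_sandwich"),
    (["knife", "peanut_butter", "bread"], "make_sandwich"),
]
X = np.array([bag_of_objects(seq) for seq, _ in train_sequences])
y = [ACTIVITIES.index(label) for _, label in train_sequences]

# Lowest layer of the hierarchy: objects -> activities via a small ANN.
activity_net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                             random_state=0)
activity_net.fit(X, y)

def recognize_scenario(gaze_segments):
    """Classify each gaze segment into an activity, then combine the
    activity posteriors into a coarse scenario-level score (here a simple
    average; a full system would place a probabilistic layer on top)."""
    probs = np.array([
        activity_net.predict_proba(bag_of_objects(seg).reshape(1, -1))[0]
        for seg in gaze_segments
    ])
    activities = [ACTIVITIES[i] for i in probs.argmax(axis=1)]
    return activities, probs.mean(axis=0)

activities, scenario_posterior = recognize_scenario(
    [["kettle", "cup", "tea_bag"], ["bread", "knife", "peanut_butter"]])
print(activities, scenario_posterior)
```

The sketch mirrors the hierarchy described above: object observations form the atomic layer, an ANN maps them to activities, and the activity-level outputs are aggregated into a scenario-level decision.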
