Depth silhouettes context: A new robust feature for human tracking and activity recognition based on embedded HMMs

Detecting, tracking and recognizing human activities and actions is a demanding research area in computer vision and human-computer interaction. This paper presents a novel video-based approach to human activity recognition (HAR) that uses robust hybrid features and embedded Hidden Markov Models (HMMs). In the proposed HAR framework, depth maps are analyzed by a temporal motion identification method to segment human silhouettes from a noisy background and to compute the depth silhouette area for each activity, which is used to track human movements in a scene. Several representative features, including invariant features, depth sequential silhouettes and spatiotemporal body joint features, are fused to capture gradient orientation change, intensity differentiation, temporal variation and local motion of specific body parts. These features are then processed according to the dynamics of their respective classes, and each activity is learned, trained and recognized with a specific embedded HMM having active feature values. Experiments on two depth datasets demonstrate that the proposed features are more efficient and robust than state-of-the-art features for human activity recognition, especially when different activities share similar postures.
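The recognition step described above, one HMM per activity class with the most likely model chosen for a test sequence, can be illustrated with a minimal sketch. The abstract does not specify the structure of the embedded HMMs or how the fused features are quantized, so this example assumes a plain discrete-observation HMM over codebook symbols. All function names, the two activity labels and the toy probabilities are hypothetical and are not taken from the paper.

```python
import math


def safe_log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm in log space: log P(obs | HMM).

    obs   -- list of observation symbol indices (assumed codebook indices)
    start -- start[s]: probability of beginning in hidden state s
    trans -- trans[p][s]: probability of moving from state p to state s
    emit  -- emit[s][o]: probability of emitting symbol o in state s
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify(obs, models):
    """Return the label whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))


# Hypothetical toy models: 2 hidden states, 3 codebook symbols per activity.
MODELS = {
    "walking": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]],
    ),
    "sitting": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]],
    ),
}

if __name__ == "__main__":
    print(classify([0, 0, 1, 1], MODELS))
    print(classify([2, 2, 1, 1], MODELS))
```

In a real system the model parameters would be estimated from training sequences (for example with Baum-Welch) rather than written by hand, and the observation symbols would come from quantizing the fused feature vectors.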
