A two-fold transformation model for human action recognition using decisive pose

Abstract Human action recognition in videos is difficult because of complex backgrounds, geometric transformations, and the enormous volume of data. To address these issues, an algorithm is developed that identifies human action in videos from a single decisive pose. The decisive pose is extracted using optical flow, and features are then extracted through a two-fold wavelet transformation consisting of the Gabor Wavelet Transform (GWT) and the Ridgelet Transform (RT). The GWT produces a feature vector by computing first-order statistics of the input pose at different scales and orientations; these features are robust to translation, scaling and rotation. The RT computes the orientation-dependent shape characteristics of the human action. Fusing the two feature sets yields a robust unified algorithm. The algorithm is evaluated on four publicly available datasets (KTH, Weizmann, Ballet Movement, and UT Interaction), on which the reported accuracies are 96.66%, 96%, 92.75% and 100%, respectively. A comparison with similar state-of-the-art methods shows superior performance.
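The GWT step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `gabor_kernel` and `gabor_features`, the filter-bank parameters (three wavelengths, four orientations, a 15x15 kernel, `sigma = 0.5 * wavelength`), and the choice of mean and standard deviation as the "first-order statistics" are all assumptions made for illustration. Filtering is done by circular convolution in the frequency domain, which is also an implementation choice.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Complex 2D Gabor kernel: Gaussian envelope times a plane wave.

    All parameter values used below are illustrative assumptions,
    not values taken from the paper.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Coordinate along the direction of the sinusoidal carrier.
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * xr / wavelength)
    return envelope * carrier


def gabor_features(image, wavelengths=(4, 8, 16), n_orient=4, size=15):
    """Mean and standard deviation of Gabor magnitude responses.

    Returns a vector of length 2 * len(wavelengths) * n_orient.
    The image must be at least `size` pixels in each dimension.
    """
    image = np.asarray(image, dtype=float)
    height, width = image.shape
    image_fft = np.fft.fft2(image)
    feats = []
    for lam in wavelengths:
        for k in range(n_orient):
            theta = k * np.pi / n_orient
            kern = gabor_kernel(size, lam, theta, sigma=0.5 * lam)
            # Zero-pad the kernel to the image size, then convolve
            # circularly by multiplying the two spectra.
            kern_fft = np.fft.fft2(kern, s=(height, width))
            magnitude = np.abs(np.fft.ifft2(image_fft * kern_fft))
            feats.extend([magnitude.mean(), magnitude.std()])
    return np.array(feats)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pose = rng.random((64, 64))  # stand-in for a decisive-pose image
    print(gabor_features(pose).shape)
```

Because the statistics are pooled over the whole response map, they do not depend on where the pose sits in the frame; with circular convolution this holds exactly for circular shifts. In a fused descriptor of the kind the abstract describes, a vector like this would be concatenated with ridgelet-based features before classification.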
