Human Action Recognition Using Adaptive Local Motion Descriptor in Spark

Human action recognition plays a significant role in the computer vision and multimedia research communities because of its numerous applications. Despite the many approaches proposed to address this problem, issues regarding the robustness and efficiency of action recognition remain unsolved. Moreover, with the rapid growth of multimedia applications from numerous sources, e.g., CCTV or video surveillance, there is an increasing demand for parallel processing of large-scale video data. In this paper, we introduce a novel approach to recognizing human actions. First, we explore Apache Spark with in-memory computing to address the task of human action recognition in a distributed environment. Second, we introduce a novel feature descriptor, the adaptive local motion descriptor (ALMD), which considers both motion and appearance. ALMD extends the local ternary pattern used for static texture analysis and generates persistent codes to describe local textures. Finally, the random forest from the Spark machine learning library (MLlib) is employed to recognize the human actions. Experimental results show the superiority of the proposed approach over other state-of-the-art methods.
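To make the starting point of the descriptor concrete, the following is a minimal sketch of the standard local ternary pattern on a single 3x3 grayscale patch, which is the static-texture operator that ALMD is described as extending. It is not the ALMD itself: the adaptive thresholding and the motion component are not specified in this abstract, so the fixed threshold `t`, the function name `ltp_codes`, and the neighbour ordering are illustrative assumptions.

```python
from typing import List, Tuple

# Offsets (row, col) of the 8 neighbours, clockwise from the top-left pixel.
# The ordering is an arbitrary choice made for this sketch.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def ltp_codes(patch: List[List[int]], t: int) -> Tuple[int, int]:
    """Return the (upper, lower) binary codes of a 3x3 patch.

    Each neighbour is quantised against the centre pixel c:
      +1 if value >= c + t, -1 if value <= c - t, 0 otherwise.
    The ternary code is then split into two 8-bit patterns: `upper`
    marks the +1 positions and `lower` marks the -1 positions.
    The threshold t is fixed here (an assumption of this sketch).
    """
    center = patch[1][1]
    upper = 0
    lower = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        value = patch[1 + dr][1 + dc]
        if value >= center + t:
            upper |= 1 << bit
        elif value <= center - t:
            lower |= 1 << bit
    return upper, lower


patch = [[60, 50, 40],
         [70, 50, 30],
         [55, 48, 52]]
print(ltp_codes(patch, 5))  # (193, 12)
```

Splitting the ternary code into two binary patterns keeps each histogram at 256 bins rather than 3^8, which is the usual reason for this encoding. A histogram of such codes over a frame or region would then serve as the feature vector passed to a classifier such as a random forest.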
