A methodology for image annotation of human actions in videos

In video-based image classification, image annotation plays a vital role in improving classification decisions based on image semantics. Several approaches to image annotation have been introduced, including manual and semi-supervised methods. However, the need for formal specification, high cost, a high probability of errors, and long computation time remain major obstacles. To overcome these issues, we propose a new image annotation technique consisting of three tiers: frame extraction, interest point generation, and clustering. The aim of the proposed technique is to automate label generation for video frames. An evaluation model is used to assess its effectiveness. The results are promising: the technique achieves 77% in terms of the Adjusted Rand Index for label generation on video frames. Finally, a comparative analysis of the existing techniques and the proposed methodology is presented.
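
The abstract reports its result as an Adjusted Rand Index (ARI), which compares the cluster labels assigned to frames against reference labels while correcting for chance agreement. The sketch below implements the standard ARI formula in plain Python. It is an illustration only: the function name `adjusted_rand_index` and the example frame labels are assumptions, and the source does not state how the authors computed the score.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard Adjusted Rand Index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two items: the partitions are trivially identical.
        return 1.0

    # Contingency counts n_ij, plus row sums a_i and column sums b_j.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = 0.5 * (sum_rows + sum_cols)       # upper bound on agreement

    if max_index == expected:
        # Degenerate case (e.g. both labelings put everything in one cluster).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical example: reference action labels vs. cluster ids for six frames.
reference = ["walk", "walk", "run", "run", "jump", "jump"]
clusters = [0, 0, 1, 1, 1, 2]
score = adjusted_rand_index(reference, clusters)
```

Because the ARI depends only on which items are grouped together, cluster ids do not need to match the reference label names; a perfect clustering with permuted ids still scores 1.0.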
