Indian Classical Dance Action Identification and Classification with Convolutional Neural Networks

Extracting and recognizing complex human movements from unconstrained online and offline video sequences is a challenging task in computer vision. This paper proposes classifying Indian classical dance actions with convolutional neural networks (CNNs). Human action recognition is performed on Indian classical dance videos drawn from both offline (controlled recording) and online (live performances, YouTube) sources. The offline data consists of ten different subjects performing 200 familiar dance mudras/poses from different Indian classical dance forms against various backgrounds. The online dance data is collected from YouTube for ten different subjects. In both cases, each dance pose occupies 60 frames of video. The CNN is trained on 8 different samples, each consisting of multiple sets of subjects, and the remaining 2 samples are used to test the trained CNN. Several CNN architectures were designed and tested on our data to improve recognition accuracy. We achieved a 93.33% recognition rate, which compares favorably with other classifier models reported on the same dataset.
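The train/test partition described above can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: the "8 samples" and "2 samples" are read here as 8 training subjects and 2 held-out test subjects, and every subject is assumed to perform all 200 poses. The function names `subject_split` and `frame_index` are hypothetical and are not taken from the paper.

```python
from itertools import product

NUM_SUBJECTS = 10      # stated in the abstract
NUM_POSES = 200        # stated in the abstract
FRAMES_PER_POSE = 60   # stated in the abstract
TRAIN_SUBJECTS = 8     # assumption: "8 samples" interpreted as 8 subjects


def subject_split(num_subjects=NUM_SUBJECTS, num_train=TRAIN_SUBJECTS):
    """Split subject ids so that no subject appears in both sets."""
    subjects = list(range(num_subjects))
    return subjects[:num_train], subjects[num_train:]


def frame_index(subjects, num_poses=NUM_POSES, frames_per_pose=FRAMES_PER_POSE):
    """List (subject, pose, frame) triples for the given subjects.

    Assumes every subject performs every pose, which the abstract
    does not state explicitly.
    """
    return list(product(subjects, range(num_poses), range(frames_per_pose)))


train_subjects, test_subjects = subject_split()
train_frames = frame_index(train_subjects)
test_frames = frame_index(test_subjects)

# Under the assumptions above: 8 * 200 * 60 and 2 * 200 * 60 frames.
print(len(train_frames), len(test_frames))  # prints "96000 24000"
```

Splitting by subject rather than by frame keeps frames of the same performer out of both sets, so the test score reflects generalization to unseen dancers; whether the paper partitions its data this way is not stated in the abstract.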
