Learning an internal representation of a deep convolutional neural network model for pose-based human action recognition

A Deep Neural Networks (DNN) which consist of multi-layered Convolutional neural networks (CNNs) in these days has successfully extracted most relevant features hierarchically for recognizing visual objects. Moreover, they are learned positional-invariant feature by itself from dataset due to a subset of connection weight is shared. They also have been used to extract spatiotemporal features of time-series for an audio classification.