Feature Extraction and Pattern Recognition for Human Motion by a Deep Sparse Autoencoder

Human motion data is high-dimensional time-series data that usually contains measurement error and noise. Recognizing human motion directly from such high-dimensional raw measurement data is often difficult, and high generalization performance cannot be expected. To increase generalization performance in a human motion pattern recognition task, we employ a deep sparse autoencoder to extract low-dimensional features, which efficiently represent the characteristics of each motion, from the high-dimensional human motion data. We then employ random forests to classify the extracted low-dimensional features. In experiments, we compared pattern recognition using the raw data with three feature extraction methods: principal component analysis, a shallow sparse autoencoder, and a deep sparse autoencoder. The experimental results show that the deep sparse autoencoder outperformed the other methods, with the highest average recognition accuracy (75.1%) and the lowest standard deviation (±3.30%). The proposed method, application of a deep sparse autoencoder, thus enabled higher recognition accuracy, better generalization, and more stability than the other methods.
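
The feature extraction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the network sizes, the sparsity penalty, or the training procedure, so the code assumes sigmoid units, a KL-divergence sparsity penalty, greedy layer-wise training by plain gradient descent, and synthetic data standing in for motion measurements. All function names, layer sizes, and hyperparameters are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sparse_layer(X, n_hidden, rho=0.05, beta=0.1, lr=0.3, epochs=200, seed=0):
    """Train one sparse autoencoder layer with full-batch gradient descent.

    rho is the target mean activation and beta weights the KL sparsity
    penalty (both are assumed values, not taken from the paper).
    Returns the encoder weights, encoder bias, and the loss history.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)          # hidden code
        X_hat = sigmoid(H @ W2 + b2)      # reconstruction
        rho_hat = np.clip(H.mean(axis=0), 1e-6, 1.0 - 1e-6)
        kl = np.sum(rho * np.log(rho / rho_hat)
                    + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))
        recon = 0.5 * np.mean(np.sum((X_hat - X) ** 2, axis=1))
        losses.append(recon + beta * kl)
        # Backpropagate the reconstruction error and the sparsity penalty.
        d_out = (X_hat - X) * X_hat * (1.0 - X_hat) / n
        d_sparse = beta * (-rho / rho_hat + (1.0 - rho) / (1.0 - rho_hat)) / n
        d_hid = (d_out @ W2.T + d_sparse) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return W1, b1, losses

def fit_deep_sparse_autoencoder(X, layer_sizes):
    """Greedily stack sparse layers: each layer is trained on the previous code."""
    layers, histories, H = [], [], X
    for k, n_hidden in enumerate(layer_sizes):
        W, b, hist = train_sparse_layer(H, n_hidden, seed=k)
        layers.append((W, b))
        histories.append(hist)
        H = sigmoid(H @ W + b)
    return layers, histories

def encode(X, layers):
    """Map inputs to the low-dimensional features of the last layer."""
    H = X
    for W, b in layers:
        H = sigmoid(H @ W + b)
    return H

# Synthetic stand-in for motion data: 200 samples, 60 dimensions in (0, 1),
# generated from a 3-dimensional latent signal plus noise (hypothetical).
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 60))
X = sigmoid(latent @ mixing + 0.1 * rng.normal(size=(200, 60)))

layers, losses = fit_deep_sparse_autoencoder(X, layer_sizes=[32, 8])
features = encode(X, layers)  # shape (200, 8): low-dimensional features
```

In a pipeline like the one described, the rows of `features` would then be passed, together with motion labels, to a random forest classifier (for example scikit-learn's `RandomForestClassifier`, which is one possible choice and not necessarily the one used in the paper).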
