Structural Laplacian Eigenmaps for Modeling Sets of Multivariate Sequences

A novel embedding-based dimensionality reduction approach, called structural Laplacian Eigenmaps, is proposed to learn models representing any concept that can be defined by a set of multivariate sequences. The approach expresses the intrinsic structure of the multivariate sequences as structural constraints, which are imposed on the dimensionality reduction process to generate a compact, data-driven manifold in a low-dimensional space. This manifold is a mathematical representation of the intrinsic nature of the concept of interest, regardless of the stylistic variability found in its instances. In addition, the approach is extended to jointly model several related concepts within a unified representation, creating a continuous space between concept manifolds. Since a generated manifold encodes the unique characteristics of the concept of interest, it can be employed to classify unknown instances of concepts. Exhaustive experimental evaluation on different datasets confirms the superiority of the proposed methodology over other state-of-the-art dimensionality reduction methods. Finally, the practical value of this novel dimensionality reduction method is demonstrated in three challenging computer vision applications: view-dependent action recognition, view-independent action recognition, and human-human interaction classification.
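
To make the idea of constraining a spectral embedding more concrete, the following is a minimal sketch of standard Laplacian Eigenmaps applied to a set of multivariate sequences. It is not the structural variant proposed here: the adjacency graph simply links temporally consecutive frames within each sequence and adds k nearest spatial neighbours across all frames, which is an assumption made only for illustration. The function names `sequence_graph` and `laplacian_eigenmaps`, the parameter `k`, and the synthetic data are all hypothetical.

```python
import numpy as np

def sequence_graph(sequences, k=2):
    """Hypothetical adjacency: temporal links inside each sequence
    plus k nearest spatial neighbours over all frames."""
    X = np.vstack(sequences)
    n = len(X)
    W = np.zeros((n, n))
    # Temporal links: connect frame t to frame t+1 within a sequence.
    start = 0
    for seq in sequences:
        for t in range(len(seq) - 1):
            i = start + t
            W[i, i + 1] = W[i + 1, i] = 1.0
        start += len(seq)
    # Spatial links: connect each frame to its k closest frames.
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(D2, np.inf)
    for i in range(n):
        for j in np.argsort(D2[i])[:k]:
            W[i, j] = W[j, i] = 1.0
    return W

def laplacian_eigenmaps(W, dim):
    """Standard Laplacian Eigenmaps: solve L y = lambda D y via the
    symmetric normalised Laplacian and keep the first `dim`
    eigenvectors after the trivial one."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Map back to generalised eigenvectors and drop the trivial one.
    return d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]

# Synthetic example: two noisy instances of one cyclic 6-D "concept".
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 20, endpoint=False)
base = np.stack([np.cos(t), np.sin(t)], axis=1) @ rng.normal(size=(2, 6))
sequences = [base + 0.05 * rng.normal(size=base.shape) for _ in range(2)]

W = sequence_graph(sequences)
Y = laplacian_eigenmaps(W, dim=2)  # one 2-D point per frame
```

In this sketch both sequences are embedded in a single shared low-dimensional space, so frames from different instances of the same concept can land close together. How the structural constraints of the proposed method are actually defined and imposed is not shown here.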
