A Latent Manifold Markovian Dynamics Gaussian Process

In this paper, we propose a Gaussian process (GP) model for the analysis of nonlinear time series. The model treats the observed data as functions of latent variables, and places GP priors on the mapping from latent representations to observations. To capture the temporal dynamics of the data, we assume that successive latent representations depend on each other through a hidden Markov prior imposed over them. We derive the model by marginalizing out its parameters in closed form, using GP priors for the observation mappings and appropriate stick-breaking priors for the (Markovian) latent variable dynamics. The result is a nonparametric Bayesian model for dynamical systems that accounts for uncertainty in the modeled data. We provide efficient inference algorithms for the model based on a truncated variational Bayesian approximation. We demonstrate the efficacy of our approach on a number of applications involving real-world data, and compare it with related state-of-the-art approaches.
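As a rough illustration of what a truncated stick-breaking prior over Markovian dynamics can look like, the sketch below draws each row of a state-transition matrix from a stick-breaking construction cut off at a fixed number of states. It is a minimal sketch, not the paper's exact formulation: the Beta(1, alpha) stick proportions, the concentration `alpha`, the truncation level, and the function names are all assumptions made for this example.

```python
import numpy as np

def truncated_stick_breaking(alpha, truncation, rng):
    """Draw weights from a stick-breaking prior truncated at `truncation` atoms.

    Assumes Beta(1, alpha) stick proportions (a Dirichlet-process-style
    construction); the prior used in the paper may differ.
    """
    # Stick proportions v_k ~ Beta(1, alpha); forcing the last one to 1
    # assigns all remaining mass to the final atom, so the weights sum to 1.
    v = rng.beta(1.0, alpha, size=truncation)
    v[-1] = 1.0
    # Stick length remaining before each break: prod_{j<k} (1 - v_j).
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

def sample_transition_matrix(alpha, truncation, rng):
    """Stack one stick-breaking draw per state into a row-stochastic matrix."""
    return np.stack(
        [truncated_stick_breaking(alpha, truncation, rng) for _ in range(truncation)]
    )

rng = np.random.default_rng(0)
weights = truncated_stick_breaking(alpha=2.0, truncation=10, rng=rng)
transitions = sample_transition_matrix(alpha=2.0, truncation=10, rng=rng)
```

Each row of `transitions` is a probability distribution over next states. Smaller values of `alpha` concentrate the mass on the first few states, which is how a truncated prior of this kind lets the data determine how many states are effectively used.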
