BLACK BOX VARIATIONAL INFERENCE FOR STATE SPACE MODELS

Latent variable time-series models are among the most heavily used tools in machine learning and applied statistics. These models can learn latent structure both from noisy observations and from the temporal ordering of the data, under the assumption that meaningful correlation structure exists across time. A few highly structured models, such as the linear dynamical system with linear-Gaussian observations, have closed-form inference procedures (e.g. the Kalman filter), but this case is an exception to the general rule that exact posterior inference in more complex generative models is intractable. Consequently, much work in time-series modeling focuses on approximate inference procedures for one particular class of models. Here, we extend recent developments in stochastic variational inference to develop a "black-box" approximate inference technique for latent variable models with latent dynamical structure. We propose a structured Gaussian variational approximate posterior that carries the same intuition as the standard Kalman filter-smoother but, importantly, permits us to use the same inference approach to approximate the posterior of much more general, nonlinear latent variable generative models. We show that our approach recovers accurate estimates for basic models with closed-form posteriors and, more interestingly, performs well in comparison to variational approaches that were designed in a bespoke fashion for specific non-conjugate models.
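To make the closed-form case concrete, the following is a minimal sketch, not the authors' implementation. It assumes a scalar AR(1) latent process with linear-Gaussian observations, for which the exact smoothing posterior is a Gaussian whose precision matrix is tridiagonal over time. Reading the "structured Gaussian" approximate posterior as having this kind of time-banded precision is an assumption made here for illustration; the function names and the parameters `a`, `q`, `q0`, `c`, `r` are hypothetical.

```python
import numpy as np

def ar1_prior_precision(T, a, q, q0):
    """Precision of z_1 ~ N(0, q0), z_t = a * z_{t-1} + N(0, q), t = 2..T."""
    J = np.zeros((T, T))
    J[0, 0] = 1.0 / q0
    for t in range(1, T):
        # Each transition term (z_t - a z_{t-1})^2 / q touches only neighbours.
        J[t, t] += 1.0 / q
        J[t - 1, t - 1] += a * a / q
        J[t, t - 1] -= a / q
        J[t - 1, t] -= a / q
    return J

def lg_posterior(y, a, q, q0, c, r):
    """Exact posterior over z given y_t = c * z_t + N(0, r)."""
    T = len(y)
    # Observations add only to the diagonal, so the precision stays tridiagonal.
    J_post = ar1_prior_precision(T, a, q, q0) + (c * c / r) * np.eye(T)
    h = (c / r) * y
    mu = np.linalg.solve(J_post, h)
    return J_post, mu

def sample_from_precision(mu, J, n_samples, rng):
    """Reparameterized draws z = mu + L^{-T} eps, where J = L L^T."""
    L = np.linalg.cholesky(J)
    eps = rng.standard_normal((len(mu), n_samples))
    # Solving L^T x = eps gives cov(x) = L^{-T} L^{-1} = J^{-1}.
    return mu[:, None] + np.linalg.solve(L.T, eps)

rng = np.random.default_rng(0)
T, a, q, q0, c, r = 5, 0.9, 0.1, 1.0, 1.0, 0.5
y = rng.standard_normal(T)  # placeholder observations, for illustration only
J_post, mu = lg_posterior(y, a, q, q0, c, r)
samples = sample_from_precision(mu, J_post, 1000, rng)
```

The dense Cholesky factorization above costs O(T^3) and is used only for brevity. Because the precision is tridiagonal, a banded factorization would bring the cost down to O(T), which is what makes this kind of structure attractive for long time series. In a nonlinear or non-conjugate model the mean and the banded precision would no longer be available in closed form and would instead be treated as variational parameters to optimize.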
