Learning to Smooth with Bidirectional Predictive State Inference Machines

We present the Smoothing Machine (SMACH, pronounced "smash"), a dynamical system learning algorithm based on chain Conditional Random Fields (CRFs) with latent states. Unlike previous methods, SMACH is designed to optimize prediction performance when we have information from both past and future observations. By leveraging Predictive State Representations (PSRs), we model beliefs about latent states through predictive states—an alternative but equivalent representation that depends directly on observable quantities. Predictive states enable the use of well-developed supervised learning approaches in place of local-optimum-prone methods like EM: we learn regressors or classifiers that can approximate message passing and marginalization in the space of predictive states. We provide theoretical guarantees on smoothing performance and we empirically verify the efficacy of SMACH on several dynamical system benchmarks.
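To make the idea of learning message passing with supervised regressors concrete, here is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a scalar observation sequence, uses a window of the next `k` observations as the observable stand-in for a predictive state, and uses closed-form ridge regression for every learner. The helper names (`ridge_fit`, `future_features`, `rollout`, `train_filter`, `smooth_inputs`, `simulate`), the toy rotating system, and the simple data-aggregation loop are all hypothetical choices for illustration. A forward regressor and a backward regressor each summarize one side of the sequence, and a third regressor combines the two messages to predict the observation at time t from everything before and after it.

```python
import numpy as np

def ridge_fit(X, Y, lam=1e-3):
    """Closed-form ridge regression: returns W such that Y is approximately X @ W."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)

def future_features(x, k):
    """Row t holds x[t+1 : t+1+k], an observable proxy for the predictive state after time t."""
    return np.stack([x[t + 1:t + 1 + k] for t in range(len(x) - k)])

def rollout(W, x, k):
    """Run the learned message update m_t = [m_{t-1}, x_t, 1] @ W along the sequence."""
    T = len(x) - k
    m, out = np.zeros(k), np.empty((T, k))
    for t in range(T):
        m = np.concatenate([m, [x[t], 1.0]]) @ W
        out[t] = m
    return out

def train_filter(x, k, n_iters=5):
    """Fit the message update by regression, aggregating data from its own rollouts."""
    target = future_features(x, k)
    T = len(target)
    prev = np.vstack([np.zeros((1, k)), target[:-1]])  # first pass uses true futures as inputs
    X_agg, Y_agg = [], []
    for _ in range(n_iters):
        X_agg.append(np.hstack([prev, x[:T, None], np.ones((T, 1))]))
        Y_agg.append(target)
        W = ridge_fit(np.vstack(X_agg), np.vstack(Y_agg))
        msgs = rollout(W, x, k)  # later passes train on the messages the filter actually produces
        prev = np.vstack([np.zeros((1, k)), msgs[:-1]])
    return W

def smooth_inputs(Wf, Wb, x, k):
    """Pair the forward message built from x[0..t-1] with the backward message built from x[t+1..]."""
    n = len(x)
    fwd = rollout(Wf, x, k)        # fwd[t] summarizes x[0..t]
    bwd = rollout(Wb, x[::-1], k)  # bwd[r] summarizes x[n-1-r..n-1]
    ts = np.arange(k, n - k)
    Z = np.hstack([fwd[ts - 1], bwd[n - 2 - ts], np.ones((len(ts), 1))])
    return ts, Z, fwd

def simulate(n, rng):
    """Toy system (an assumption): noisy 2-D rotation observed through its first coordinate."""
    c, s_ = np.cos(0.3), np.sin(0.3)
    A = 0.98 * np.array([[c, -s_], [s_, c]])
    s, x = np.array([1.0, 0.0]), np.empty(n)
    for t in range(n):
        s = A @ s + 0.05 * rng.standard_normal(2)
        x[t] = s[0] + 0.2 * rng.standard_normal()
    return x

rng = np.random.default_rng(0)
k = 4
x_train, x_test = simulate(400, rng), simulate(400, rng)

Wf = train_filter(x_train, k)        # forward message update
Wb = train_filter(x_train[::-1], k)  # backward message update, trained on the reversed sequence
ts, Z, _ = smooth_inputs(Wf, Wb, x_train, k)
w_smooth = ridge_fit(Z, x_train[ts])  # combiner: both messages -> observation at time t

ts, Z, fwd = smooth_inputs(Wf, Wb, x_test, k)
mse_filter = np.mean((fwd[ts - 1][:, 0] - x_test[ts]) ** 2)  # past-only prediction of x_t
mse_smooth = np.mean((Z @ w_smooth - x_test[ts]) ** 2)       # past-and-future prediction of x_t
print(f"forward-only MSE: {mse_filter:.4f}  smoothed MSE: {mse_smooth:.4f}")
```

Every step here is an ordinary regression problem, so no EM-style latent-variable optimization is needed. Ridge regression is only a placeholder; any regressor or classifier could be substituted for the message updates and the combiner.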
