Learning to Filter with Predictive State Inference Machines

Latent state space models are a fundamental and widely used tool for modeling dynamical systems. However, they are difficult to learn from data, and learned models often lack performance guarantees on inference tasks such as filtering and prediction. In this work, we present the Predictive State Inference Machine (PSIM), a data-driven method that treats the inference procedure on a dynamical system as a composition of predictors. The key idea is that, rather than first learning a latent state space model and then using the learned model for inference, PSIM directly learns predictors for inference in predictive state space. We provide theoretical guarantees for inference in both realizable and agnostic settings, and showcase practical performance on a variety of simulated and real-world robotics benchmarks.

Byron Boots | J. Andrew Bagnell | Arun Venkatraman | Wen Sun
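
To make the abstract's idea of "inference as a composition of predictors" concrete, here is a minimal sketch rather than the authors' implementation. It makes several assumptions that the abstract does not state: the predictive state is taken to be the expected next k observations, the predictor is a linear ridge regression, training re-fits on data aggregated from the filter's own roll-outs, and the toy system (a noisy damped rotation) is invented for illustration. All function names are hypothetical.

```python
import numpy as np

def simulate(T, rng, theta=0.1, decay=0.95):
    # Hypothetical toy system: noisy scalar observations of a damped 2-D rotation.
    A = decay * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta), np.cos(theta)]])
    s = rng.standard_normal(2)
    xs = np.empty(T)
    for t in range(T):
        xs[t] = s[0] + 0.05 * rng.standard_normal()
        s = A @ s + 0.1 * rng.standard_normal(2)
    return xs

def futures(xs, k):
    # Row t holds the next k observations x_t, ..., x_{t+k-1}.
    return np.stack([xs[t:t + k] for t in range(len(xs) - k + 1)])

def fit_ridge(X, Y, lam=1e-3):
    # Closed-form ridge regression: minimizes ||X W - Y||^2 + lam * ||W||^2.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)

def roll_out(W, m0, xs, k):
    # Filtering as a composition of one predictor:
    # m_{t+1} = [m_t, x_t, 1] @ W, starting from the initial state m0.
    ms = [m0]
    for t in range(len(xs) - k):
        ms.append(np.concatenate([ms[-1], [xs[t], 1.0]]) @ W)
    return np.array(ms)  # ms[t] is the predicted future before x_t is seen

def train(trajs, k, n_iters=5):
    F = [futures(xs, k) for xs in trajs]
    m0 = np.mean([f[0] for f in F], axis=0)  # average first future
    W = np.zeros((k + 2, k))
    X_all, Y_all = [], []
    for _ in range(n_iters):
        for xs, f in zip(trajs, F):
            ms = roll_out(W, m0, xs, k)      # states the current filter visits
            T = len(f) - 1
            X_all.append(np.hstack([ms[:T], xs[:T, None], np.ones((T, 1))]))
            Y_all.append(f[1:T + 1])         # regression target: the next future
        W = fit_ridge(np.vstack(X_all), np.vstack(Y_all))  # fit on aggregated data
    return W, m0

rng = np.random.default_rng(0)
k = 3
trajs = [simulate(60, rng) for _ in range(20)]
W, m0 = train(trajs, k)
ms = roll_out(W, m0, simulate(60, rng), k)  # filter an unseen trajectory
```

No latent state space model is estimated at any point: the only learned object is the regression matrix `W`, and filtering is just repeated application of it to the current predictive state and the newest observation.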
