Predictive State Temporal Difference Learning
[1] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[2] Lihong Li, et al. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning, 2008, ICML '08.
[3] Edward J. Sondik, et al. The optimal control of partially observable Markov processes, 1971.
[4] Steven J. Bradtke, et al. Linear Least-Squares algorithms for temporal difference learning, 2004, Machine Learning.
[5] Sebastian Thrun, et al. Learning low dimensional predictive representations, 2004, ICML.
[6] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[7] Sham M. Kakade, et al. A spectral algorithm for learning Hidden Markov Models, 2008, J. Comput. Syst. Sci..
[8] Michael R. James, et al. Predictive State Representations: A New Theory for Modeling Dynamical Systems, 2004, UAI.
[9] Craig Boutilier, et al. Value-Directed Compression of POMDPs, 2002, NIPS.
[10] John N. Tsitsiklis, et al. Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives, 1999, IEEE Trans. Autom. Control..
[11] Chang Wang, et al. Compact Spectral Bases for Value Function Approximation Using Kronecker Factorization, 2007, AAAI.
[12] David Choi, et al. A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning, 2001, Discret. Event Dyn. Syst..
[13] Sridhar Mahadevan, et al. Representation Policy Iteration, 2005, UAI.
[14] Jiming Liu, et al. A novel orthogonal NMF-based belief compression for POMDPs, 2007, ICML '07.
[15] Michael H. Bowling, et al. Learning predictive state representations using non-blind policies, 2006, ICML '06.
[16] Tohru Katayama, et al. Subspace Methods for System Identification, 2005.
[17] Byron Boots, et al. Reduced-Rank Hidden Markov Models, 2009, AISTATS.
[18] J. Pearl. Causality: Models, Reasoning and Inference, 2000.
[19] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res..
[20] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[21] Bart De Moor, et al. Subspace Identification for Linear Systems: Theory ― Implementation ― Applications, 2011.
[22] Yishay Mansour, et al. Planning in POMDPs Using Multiplicity Automata, 2005, UAI.
[23] Andrew Y. Ng, et al. Regularization and feature selection in least-squares temporal difference learning, 2009, ICML '09.
[24] H. Hotelling. The most predictable criterion, 1935.
[25] Nikos A. Vlassis, et al. Improving Approximate Value Iteration Using Memories and Predictive State Representations, 2006, AAAI.
[26] Sridhar Mahadevan, et al. Samuel Meets Amarel: Automating Value Function Approximation Using Global State Space Analysis, 2005, AAAI.
[27] Sridhar Mahadevan, et al. Compressing POMDPs Using Locality Preserving Non-Negative Matrix Factorization, 2010, AAAI.
[28] Robert H. Halstead, et al. Matrix Computations, 2011, Encyclopedia of Parallel Computing.
[29] Justin A. Boyan, et al. Least-Squares Temporal Difference Learning, 1999, ICML.
[30] G. Reinsel, et al. Multivariate Reduced-Rank Regression: Theory and Applications, 1998.
[31] Byron Boots, et al. Closing the learning-planning loop with predictive state representations, 2011, Int. J. Robotics Res..
[32] Stefano Soatto, et al. Dynamic Data Factorization, 2001.
[33] Herbert Jaeger, et al. Observable Operator Models for Discrete Stochastic Time Series, 2000, Neural Computation.