Value function approximation via low-rank models

We propose a novel value function approximation technique for Markov decision processes. We consider the problem of compactly representing the state-action value function using a low-rank and sparse matrix model. The problem is to decompose a matrix that encodes the true value function into low-rank and sparse components, and we achieve this using Robust Principal Component Analysis (PCA). Under minimal assumptions, this Robust PCA problem can be solved exactly via the Principal Component Pursuit convex optimization problem. We experiment the procedure on several examples and demonstrate that our method yields approximations essentially identical to the true function.

[1]  Pablo A. Parrilo,et al.  Rank-Sparsity Incoherence for Matrix Decomposition , 2009, SIAM J. Optim..

[2]  Emmanuel J. Candès,et al.  Matrix Completion With Noise , 2009, Proceedings of the IEEE.

[3]  Arvind Ganesh,et al.  Fast Convex Optimization Algorithms for Exact Recovery of a Corrupted Low-Rank Matrix , 2009 .

[4]  Emmanuel J. Candès,et al.  The Power of Convex Relaxation: Near-Optimal Matrix Completion , 2009, IEEE Transactions on Information Theory.

[5]  Alan Edelman,et al.  Julia: A Fast Dynamic Language for Technical Computing , 2012, ArXiv.

[6]  Heng Tao Shen,et al.  Principal Component Analysis , 2009, Encyclopedia of Biometrics.

[7]  Dimitri P. Bertsekas,et al.  Constrained Optimization and Lagrange Multiplier Methods , 1982 .

[8]  Michael A. Saunders,et al.  Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..

[9]  H. Hotelling Analysis of a complex of statistical variables into principal components. , 1933 .

[10]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[11]  Y. Nesterov A method for solving the convex programming problem with convergence rate O(1/k^2) , 1983 .

[12]  Yi Ma,et al.  The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices , 2010, Journal of structural biology.

[13]  Yurii Nesterov,et al.  Smooth minimization of non-smooth functions , 2005, Math. Program..

[14]  Xiaoming Yuan,et al.  Sparse and low-rank matrix decomposition via alternating direction method , 2013 .

[15]  Shiqian Ma,et al.  Fixed point and Bregman iterative methods for matrix rank minimization , 2009, Math. Program..

[16]  C. Eckart,et al.  The approximation of one matrix by another of lower rank , 1936 .

[17]  Benjamin Recht,et al.  A Simpler Approach to Matrix Completion , 2009, J. Mach. Learn. Res..

[18]  Wotao Yin,et al.  An Iterative Regularization Method for Total Variation-Based Image Restoration , 2005, Multiscale Model. Simul..

[19]  Mikhail Belkin,et al.  Laplacian Eigenmaps for Dimensionality Reduction and Data Representation , 2003, Neural Computation.

[20]  Marc Teboulle,et al.  A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems , 2009, SIAM J. Imaging Sci..

[21]  Jeffrey K. Uhlmann,et al.  Unscented filtering and nonlinear estimation , 2004, Proceedings of the IEEE.

[22]  Andrea Montanari,et al.  Matrix completion from a few entries , 2009, ISIT.

[23]  Yi Ma,et al.  Robust principal component analysis? , 2009, JACM.

[24]  Shiqian Ma,et al.  Convergence of Fixed-Point Continuation Algorithms for Matrix Rank Minimization , 2009, Found. Comput. Math..

[25]  Emmanuel J. Candès,et al.  Exact Matrix Completion via Convex Optimization , 2009, Found. Comput. Math..