Dantzig Selector with an Approximately Optimal Denoising Matrix and its Application to Reinforcement Learning

Dantzig Selector (DS) is widely used in compressed sensing and sparse learning for feature selection and sparse signal recovery. Since the DS formulation is essentially a linear program, many existing linear programming solvers can be applied directly to scale it up. The DS formulation can be interpreted as a basis pursuit denoising problem in which the data matrix (or measurement matrix) serves as the denoising matrix that eliminates the observation noise. However, we observe that the data matrix may not be the optimal denoising matrix, as a simple counter-example shows. This motivates us to pursue a better denoising matrix for a generalized DS formulation. We first define the optimal denoising matrix through a minimax optimization, which turns out to be an NP-hard problem. To make the problem computationally tractable, we propose a novel algorithm, termed the Optimal Denoising Dantzig Selector (ODDS), that approximately estimates the optimal denoising matrix. Empirical experiments validate the proposed method. Finally, we formulate a novel sparse reinforcement learning algorithm by extending ODDS to temporal difference learning, and empirical results demonstrate that it outperforms the conventional vanilla DS-TD algorithm.
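For concreteness, the vanilla DS of Candes and Tao and the generalized variant described above can be written side by side. The denoising-matrix notation F below is a sketch of the abstract's description, not necessarily the paper's exact notation; setting F = X recovers the vanilla DS, and lambda > 0 is the usual tuning parameter:

\min_{\beta} \|\beta\|_1 \quad \text{s.t.} \quad \|X^\top (y - X\beta)\|_\infty \le \lambda \qquad \text{(vanilla DS)}

\min_{\beta} \|\beta\|_1 \quad \text{s.t.} \quad \|F^\top (y - X\beta)\|_\infty \le \lambda \qquad \text{(generalized DS, denoising matrix F)}

Both the l1 objective and the l-infinity constraint are polyhedral, which is why off-the-shelf LP solvers apply. A minimal Python sketch of that reduction for the vanilla DS, assuming the standard variable split beta = u - v with u, v >= 0 (the function and variable names here are illustrative, not from the paper):

import numpy as np
from scipy.optimize import linprog

def dantzig_selector_lp(X, y, lam):
    # min ||beta||_1  s.t.  ||X^T (y - X beta)||_inf <= lam,
    # solved as an LP in z = [u; v] with beta = u - v, u, v >= 0.
    n, p = X.shape
    A = X.T @ X                         # p x p Gram matrix
    b = X.T @ y
    # ||A beta - b||_inf <= lam  <=>  A(u - v) <= b + lam  and
    #                                -A(u - v) <= lam - b
    A_ub = np.vstack([np.hstack([A, -A]),
                      np.hstack([-A, A])])
    b_ub = np.concatenate([b + lam, lam - b])
    c = np.ones(2 * p)                  # sum(u) + sum(v) = ||beta||_1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
    z = res.x
    return z[:p] - z[p:]                # beta = u - v

# Usage on a tiny synthetic sparse-recovery instance.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
beta_true = np.zeros(20)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(50)
print(np.round(dantzig_selector_lp(X, y, lam=1.0), 2))

The generalized form uses the same reduction with F^T in place of X^T in the constraint matrices, so the LP machinery carries over unchanged; only the choice of F, which ODDS approximately optimizes, differs.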
