Statistical linear estimation with penalized estimators: an application to reinforcement learning
[1] Bin Yu. RATES OF CONVERGENCE FOR EMPIRICAL PROCESSES OF STATIONARY MIXING SEQUENCES , 1994 .
[2] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[3] A. Kirsch. An Introduction to the Mathematical Theory of Inverse Problems , 1996, Applied Mathematical Sciences.
[4] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[5] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[6] Paul-Marie Samson,et al. Concentration of measure inequalities for Markov chains and $\Phi$-mixing processes , 2000 .
[7] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[8] L. Rosasco. Regularization approaches in learning theory , 2006 .
[9] Claudio Gentile,et al. On the generalization ability of on-line learning algorithms , 2001, IEEE Transactions on Information Theory.
[10] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[11] Lorenzo Rosasco,et al. Learning from Examples as an Inverse Problem , 2005, J. Mach. Learn. Res.
[12] Csaba Szepesvári,et al. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path , 2006, Machine Learning.
[13] Andreas Christmann,et al. Support vector machines , 2008, Data Mining and Knowledge Discovery Handbook.
[14] Shie Mannor,et al. Regularized Policy Iteration , 2008, NIPS.
[15] Rémi Munos,et al. Compressed Least-Squares Regression , 2009, NIPS.
[16] Andrew Y. Ng,et al. Regularization and feature selection in least-squares temporal difference learning , 2009, ICML '09.
[17] P. Bickel,et al. SIMULTANEOUS ANALYSIS OF LASSO AND DANTZIG SELECTOR , 2008, 0801.1095.
[18] A. Belloni,et al. Square-Root Lasso: Pivotal Recovery of Sparse Signals via Conic Programming , 2010, 1009.5689.
[19] Bruno Scherrer,et al. Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view , 2010, ICML.
[20] Dimitri P. Bertsekas,et al. Error Bounds for Approximations from Projected Linear Equations , 2010, Math. Oper. Res..
[21] Alessandro Lazaric,et al. Finite-Sample Analysis of LSTD , 2010, ICML.
[22] Alessandro Lazaric,et al. LSTD with Random Projections , 2010, NIPS.
[23] Csaba Szepesvári,et al. Algorithms for Reinforcement Learning , 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[24] Odalric-Ambrym Maillard,et al. APPRENTISSAGE SÉQUENTIEL : Bandits, Statistique et Renforcement , 2011 .
[25] Matthew W. Hoffman,et al. Finite-Sample Analysis of Lasso-TD , 2011, ICML.
[26] V. Koltchinskii,et al. Oracle inequalities in empirical risk minimization and sparse recovery problems , 2011 .
[27] Gilles Stoltz,et al. Inverse problems and high dimensional estimation , 2011 .
[28] Sara van de Geer,et al. Statistics for High-Dimensional Data: Methods, Theory and Applications , 2011 .
[29] A. Belloni,et al. Square-Root Lasso: Pivotal Recovery of Sparse Signals via Conic Programming , 2011 .
[30] Csaba Szepesvári,et al. Regularized least-squares regression: Learning from a β-mixing sequence , 2012 .
[31] Roman Vershynin,et al. Introduction to the non-asymptotic analysis of random matrices , 2010, Compressed Sensing.