Sham M. Kakade | Qi Lei | Baihe Huang | Runzhe Wang | Jason D. Lee | Jiaqi Yang | Kaixuan Huang