Tuo Zhao | Yizhou Wang | Xingguo Li | Zhuoran Yang | Tianyi Liu | Minshuo Chen | Zhaoran Wang
[1] Michael L. Littman, et al. A theoretical analysis of Model-Based Interval Estimation, 2005, ICML.
[2] Francis R. Bach, et al. On the Equivalence between Kernel Quadrature Rules and Random Feature Expansions, 2015, J. Mach. Learn. Res.
[3] Arthur Gretton, et al. On gradient regularizers for MMD GANs, 2018, NeurIPS.
[4] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[5] Bin Yu. Rates of Convergence for Empirical Processes of Stationary Mixing Sequences, 1994.
[6] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[7] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.
[8] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[9] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.
[10] Ameet Talwalkar, et al. Foundations of Machine Learning, 2012, Adaptive computation and machine learning.
[11] Yingyu Liang, et al. Generalization and Equilibrium in Generative Adversarial Nets (GANs), 2017, ICML.
[12] Mykel J. Kochenderfer, et al. Imitating driver behavior with generative adversarial networks, 2017, 2017 IEEE Intelligent Vehicles Symposium (IV).
[13] Saeed Ghadimi, et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming, 2013, SIAM J. Optim.
[14] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[15] J. Andrew Bagnell, et al. Efficient Reductions for Imitation Learning, 2010, AISTATS.
[16] A. Müller. Integral Probability Metrics and Their Generating Classes of Functions, 1997, Advances in Applied Probability.
[17] M. Willem. Minimax Theorems, 1997.
[18] Léon Bottou, et al. Large-Scale Machine Learning with Stochastic Gradient Descent, 2010, COMPSTAT.
[19] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[20] Peter L. Bartlett, et al. Neural Network Learning - Theoretical Foundations, 1999.
[21] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.
[22] Luca Bascetta, et al. Policy gradient in Lipschitz Markov Decision Processes, 2015, Machine Learning.
[23] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[24] Mingrui Liu, et al. Non-Convex Min-Max Optimization: Provable Algorithms and Applications in Machine Learning, 2018, ArXiv.
[25] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[26] Le Song, et al. Learning Temporal Point Processes via Reinforcement Learning, 2018, NeurIPS.
[27] Arthur Gretton, et al. Demystifying MMD GANs, 2018, ICLR.
[28] Tim Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, 2016, NIPS.
[29] Brett Browning, et al. A survey of robot learning from demonstration, 2009, Robotics Auton. Syst.
[30] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[31] S. S. Vallender. Calculation of the Wasserstein Distance Between Probability Distributions on the Line, 1974.
[32] Kee-Eung Kim, et al. Imitation Learning via Kernel Mean Embedding, 2018, AAAI.
[33] Alexander Shapiro, et al. Stochastic Approximation approach to Stochastic Programming, 2013.
[34] Yongxin Chen, et al. On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator, 2019, ArXiv.
[35] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[36] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[37] Elizabeth L. Wilmer, et al. Markov Chains and Mixing Times, 2008.
[38] Wolfram Burgard, et al. Socially Compliant Navigation Through Raw Depth Inputs with Generative Adversarial Imitation Learning, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[39] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[40] Léon Bottou, et al. Wasserstein Generative Adversarial Networks, 2017, ICML.
[41] Le Song, et al. Smoothed Dual Embedding Control, 2017, ArXiv.
[42] Yiming Yang, et al. MMD GAN: Towards Deeper Understanding of Moment Matching Network, 2017, NIPS.
[43] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[44] K. Fan, et al. Minimax Theorems, 1953, Proceedings of the National Academy of Sciences of the United States of America.
[45] Dean Pomerleau, et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation, 1991, Neural Computation.
[46] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[47] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[48] Eduardo F. Morales, et al. An Introduction to Reinforcement Learning, 2011.
[49] Liming Xiang, et al. Kernel-Based Reinforcement Learning, 2006, ICIC.
[50] Thomas J. Walsh, et al. Knows what it knows: a framework for self-aware learning, 2008, ICML '08.
[51] Yunmei Chen, et al. Optimal Primal-Dual Methods for a Class of Saddle Point Problems, 2013, SIAM J. Optim.
[52] Le Song, et al. SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation, 2017, ICML.
[53] Vivek S. Borkar, et al. Learning Algorithms for Markov Decision Processes with Average Cost, 2001, SIAM J. Control. Optim.
[54] Sergey Levine, et al. Continuous Inverse Optimal Control with Locally Optimal Examples, 2012, ICML.
[55] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[56] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[57] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[58] Mehryar Mohri, et al. Rademacher Complexity Bounds for Non-I.I.D. Processes, 2008, NIPS.
[59] Arkadi Nemirovski, et al. Robust Convex Optimization, 1998, Math. Oper. Res.
[60] Michael H. Bowling, et al. Apprenticeship learning using linear programming, 2008, ICML '08.
[61] W. Murray, et al. A Projected Lagrangian Algorithm for Nonlinear Minimax Optimization, 1980.
[62] Andreas Krause, et al. Reinforced Imitation: Sample Efficient Deep Reinforcement Learning for Mapless Navigation by Leveraging Prior Demonstrations, 2018, IEEE Robotics and Automation Letters.
[63] Antonin Chambolle, et al. A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging, 2011, Journal of Mathematical Imaging and Vision.