Jan Peters | Hany Abdulsamad | Carlo D'Eramo | Joni Pajarinen | Boris Belousov | Pascal Klink
[1] Jan Peters, et al. High Acceleration Reinforcement Learning for Real-World Juggling with Binary Rewards, 2020, CoRL.
[2] Jürgen Schmidhuber, et al. Curious model-building control systems, 1991, IEEE International Joint Conference on Neural Networks.
[3] Eduardo F. Morales, et al. An Introduction to Reinforcement Learning, 2011.
[4] Nan Jiang, et al. Markov Decision Processes with Continuous Side Information, 2017, ALT.
[5] Farhan Abrol, et al. Variational Tempering, 2016, AISTATS.
[6] Pierre-Yves Oudeyer, et al. Intrinsic Motivation Systems for Autonomous Mental Development, 2007, IEEE Transactions on Evolutionary Computation.
[7] Liang Zheng, et al. Unsupervised Person Re-identification: Clustering and Fine-tuning, 2017.
[8] Shiguang Shan, et al. Self-Paced Curriculum Learning, 2015, AAAI.
[9] John L. Nazareth, et al. Introduction to derivative-free optimization, 2010, Math. Comput.
[10] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[11] Pierre-Yves Oudeyer, et al. Accuracy-based Curriculum Learning in Deep Reinforcement Learning, 2018, ArXiv.
[12] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[13] Alessandro Lazaric, et al. Transfer in Reinforcement Learning: A Framework and a Survey, 2012, Reinforcement Learning.
[14] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[15] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[16] Pierre-Yves Oudeyer, et al. Intrinsically motivated goal exploration for active motor learning in robots: A case study, 2010, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[17] Shiguang Shan, et al. Self-Paced Learning with Diversity, 2014, NIPS.
[18] Yuval Tassa, et al. Maximum a Posteriori Policy Optimisation, 2018, ICLR.
[19] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[20] Daphne Koller, et al. Self-Paced Learning for Latent Variable Models, 2010, NIPS.
[21] Martin A. Riedmiller, et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch, 2018, ICML.
[22] Qiang Yang, et al. A Survey on Transfer Learning, 2010, IEEE Transactions on Knowledge and Data Engineering.
[23] Andreas Krause, et al. Safe controller optimization for quadrotors with Gaussian processes, 2016, IEEE International Conference on Robotics and Automation (ICRA).
[24] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[25] Daoyi Dong, et al. Self-Paced Prioritized Curriculum Learning With Coverage Penalty in Deep Reinforcement Learning, 2018, IEEE Transactions on Neural Networks and Learning Systems.
[26] Marc Toussaint, et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference, 2012, Robotics: Science and Systems.
[27] Sergey Levine, et al. Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review, 2018, ArXiv.
[28] Peter Stone, et al. Learning Curriculum Policies for Reinforcement Learning, 2018, AAMAS.
[29] Amos Storkey, et al. Continuously Tempered Hamiltonian Monte Carlo, 2017, UAI.
[30] Jan Peters, et al. Probabilistic Movement Primitives, 2013, NIPS.
[31] Michael A. Osborne, et al. Probabilistic numerics and uncertainty in computations, 2015, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences.
[32] Marlos C. Machado, et al. Count-Based Exploration with the Successor Representation, 2018, AAAI.
[33] B. Skinner, et al. The Behavior of Organisms: An Experimental Analysis, 2016.
[34] Jan Peters, et al. Policy Search for Motor Primitives in Robotics, 2011, Machine Learning.
[35] Petros Koumoutsakos, et al. Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES), 2003, Evolutionary Computation.
[36] Chang Liu, et al. Understanding and Accelerating Particle-Based Variational Inference, 2018, ICML.
[37] Jianhong Wang, et al. Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning, 2017, NeurIPS.
[38] Jan Peters, et al. A Survey on Policy Search for Robotics, 2013, Found. Trends Robotics.
[39] Anne Auger, et al. Comparing results of 31 algorithms from the black-box optimization benchmarking BBOB-2009, 2010, GECCO.
[40] Deva Ramanan, et al. Self-Paced Learning for Long-Term Tracking, 2013, IEEE Conference on Computer Vision and Pattern Recognition.
[41] Herke van Hoof, et al. A Performance-Based Start State Curriculum Framework for Reinforcement Learning, 2020, AAMAS.
[42] Naonori Ueda, et al. Deterministic Annealing Variant of the EM Algorithm, 1994, NIPS.
[43] Gerhard Neumann, et al. Variational Inference for Policy Search in changing situations, 2011, ICML.
[44] Pieter Abbeel, et al. Reverse Curriculum Generation for Reinforcement Learning, 2017, CoRL.
[45] Phil Husbands, et al. Once More Unto the Breach: Co-evolving a robot and its simulator, 2004.
[46] Geoffrey E. Hinton, et al. Using Expectation-Maximization for Reinforcement Learning, 1997, Neural Computation.
[47] Jan Peters, et al. Data-Efficient Generalization of Robot Skills with Contextual Policy Search, 2013, AAAI.
[48] Pieter Abbeel, et al. Automatic Goal Generation for Reinforcement Learning Agents, 2017, ICML.
[49] F. Aluffi-Pentini, et al. The Use of "Continuous Method" in Complementarity Problems, 1985.
[50] Andre Wibisono, et al. Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem, 2018, COLT.
[51] Yasemin Altun, et al. Relative Entropy Policy Search, 2010, AAAI.
[52] Jason Weston, et al. Curriculum learning, 2009, ICML.
[53] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.
[54] G. Parisi, et al. Simulated tempering: a new Monte Carlo scheme, 1992, arXiv:hep-lat/9205018.
[55] Jan Peters, et al. Reinforcement learning vs human programming in tetherball robot games, 2015, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[56] Matthew Fellows, et al. VIREL: A Variational Inference Framework for Reinforcement Learning, 2018, NeurIPS.
[57] Jan Peters, et al. Receding Horizon Curiosity, 2019, CoRL.
[58] C. D. Gelatt, et al. Optimization by Simulated Annealing, 1983, Science.
[59] Simon J. D. Prince, et al. Computer Vision: Models, Learning, and Inference, 2012.
[60] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[61] Pierre-Yves Oudeyer, et al. Teacher algorithms for curriculum learning of Deep RL in continuously parameterized environments, 2019, CoRL.
[62] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[63] Tong Zhang, et al. Multi-stage Convex Relaxation for Learning with Sparse Regularization, 2008, NIPS.
[64] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[65] Matthew E. Taylor, et al. Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey, 2020, J. Mach. Learn. Res.
[66] Marc Toussaint, et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes, 2006, ICML.
[67] Deepak Kumar, et al. Bringing Up Robot: Fundamental Mechanisms for Creating a Self-Motivated, Self-Organizing Architecture, 2005, Cybern. Syst.
[68] S. Schaal. Dynamic Movement Primitives - A Framework for Motor Control in Humans and Humanoid Robotics, 2006.
[69] Cun-Hui Zhang. Nearly unbiased variable selection under minimax concave penalty, 2010, arXiv:1002.4734.
[70] Peter Rossmanith, et al. Simulated Annealing, 2008, Taschenbuch der Algorithmen.
[71] Deyu Meng, et al. A theoretical understanding of self-paced learning, 2017, Inf. Sci.
[72] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[73] Jan Peters, et al. Non-parametric Policy Search with Limited Information Loss, 2017, J. Mach. Learn. Res.
[74] Deyu Meng, et al. Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search, 2014, ACM Multimedia.
[75] Kenneth O. Stanley, et al. POET: open-ended coevolution of environments and their optimized solutions, 2019, GECCO.
[76] Minoru Asada, et al. Purposive Behavior Acquisition for a Real Robot by Vision-Based Reinforcement Learning, 2005, Machine Learning.
[77] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[78] Xiao-Li Meng, et al. Simulating Normalizing Constants: From Importance Sampling to Bridge Sampling to Path Sampling, 1998, Statistical Science.
[79] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[80] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[81] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[82] W. D. Smart, et al. What does shaping mean for computational reinforcement learning?, 2008, 7th IEEE International Conference on Development and Learning.