Accuracy-based Curriculum Learning in Deep Reinforcement Learning

In this paper, we investigate a new form of automated curriculum learning based on adaptive selection of accuracy requirements, called accuracy-based curriculum learning. Using a reinforcement learning agent based on the Deep Deterministic Policy Gradient algorithm and addressing the Reacher environment, we first show that an agent trained with various accuracy requirements sampled randomly learns more efficiently than when asked to be very accurate at all times. Then we show that adaptive selection of accuracy requirements, based on a local measure of competence progress, automatically generates a curriculum where difficulty progressively increases, resulting in a better learning efficiency than sampling randomly.

[1]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[2]  Philip Bachman,et al.  Deep Reinforcement Learning that Matters , 2017, AAAI.

[3]  Pierre-Yves Oudeyer,et al.  Socially guided intrinsic motivation for robot learning of motor skills , 2014, Auton. Robots.

[4]  Jürgen Schmidhuber,et al.  Curious model-building control systems , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.

[5]  Pierre-Yves Oudeyer,et al.  R-IAC: Robust Intrinsically Motivated Exploration and Active Learning , 2009, IEEE Transactions on Autonomous Mental Development.

[6]  Pierre-Yves Oudeyer,et al.  Curiosity-Driven Development of Tool Use Precursors: a Computational Model , 2016, CogSci.

[7]  Yuval Tassa,et al.  Continuous control with deep reinforcement learning , 2015, ICLR.

[8]  Peter Henderson,et al.  Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control , 2017, ArXiv.

[9]  Pierre-Yves Oudeyer,et al.  Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration , 2018, ICLR.

[10]  Pierre-Yves Oudeyer,et al.  Active learning of inverse models with intrinsically motivated goal exploration in robots , 2013, Robotics Auton. Syst..

[11]  Tom Schaul,et al.  Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.

[12]  Pierre-Yves Oudeyer,et al.  Intrinsically Motivated Learning of Real-World Sensorimotor Skills with Developmental Constraints , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.

[13]  Nuttapong Chentanez,et al.  Intrinsically Motivated Reinforcement Learning , 2004, NIPS.

[14]  Pierre-Yves Oudeyer,et al.  Intrinsic Motivation Systems for Autonomous Mental Development , 2007, IEEE Transactions on Evolutionary Computation.

[15]  Pierre-Yves Oudeyer,et al.  Multi-Armed Bandits for Intelligent Tutoring Systems , 2013, EDM.

[16]  Pierre-Yves Oudeyer,et al.  Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning , 2017, J. Mach. Learn. Res..

[17]  Jürgen Schmidhuber,et al.  PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem , 2011, Front. Psychol..

[18]  Pierre-Yves Oudeyer,et al.  Active choice of teachers, learning strategies and goals for a socially guided intrinsic motivation learner , 2012, Paladyn J. Behav. Robotics.

[19]  Tom Schaul,et al.  Universal Value Function Approximators , 2015, ICML.

[20]  Marcin Andrychowicz,et al.  Hindsight Experience Replay , 2017, NIPS.

[21]  Pieter Abbeel,et al.  Reverse Curriculum Generation for Reinforcement Learning , 2017, CoRL.

[22]  Guy Lever,et al.  Deterministic Policy Gradient Algorithms , 2014, ICML.

[23]  Pierre-Yves Oudeyer,et al.  The strategic student approach for life-long exploration and learning , 2012, 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL).

[24]  Peter Stone,et al.  Deep Reinforcement Learning in Parameterized Action Space , 2015, ICLR.

[25]  D. Weinshall,et al.  Curriculum Learning by Transfer Learning: Theory and Experiments with Deep Networks , 2018, ICML.

[26]  Wojciech Zaremba,et al.  OpenAI Gym , 2016, ArXiv.

[27]  Marlos C. Machado,et al.  A Laplacian Framework for Option Discovery in Reinforcement Learning , 2017, ICML.

[28]  Alex Graves,et al.  Automated Curriculum Learning for Neural Networks , 2017, ICML.

[29]  John Schulman,et al.  Teacher–Student Curriculum Learning , 2017, IEEE Transactions on Neural Networks and Learning Systems.