Evolving indoor navigational strategies using gated recurrent units in NEAT

Simultaneous Localisation and Mapping (SLAM) algorithms are expensive to run on smaller robotic platforms such as Micro-Aerial Vehicles. Bug algorithms are an alternative that require relatively little processing power and avoid high memory consumption by not building an explicit map of the environment. In this work we explore the performance of neuroevolution, specifically NEAT, at evolving control policies for simulated differential-drive robots carrying out generalised maze navigation. We compare this performance against a particular bug algorithm known as I-Bug. We extend NEAT to include Gated Recurrent Units (GRUs) to help deal with long-term dependencies. We show that both NEAT and our NEAT-GRU can repeatably generate controllers that outperform I-Bug on a test set of 209 indoor maze-like environments, and that NEAT-GRU is superior to NEAT on this task. Moreover, we show that of the two systems, only NEAT-GRU can consistently evolve successful controllers for a much harder task in which no bearing information about the target is provided to the agent.
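For concreteness, the gated unit referred to above follows the standard GRU formulation of Cho et al. (2014). A minimal sketch of that update rule is given below; the NumPy framing, the class name GRUUnit, and the random weight initialisation are illustrative assumptions rather than the authors' implementation, in which such weights would presumably be evolved within NEAT genomes rather than trained by gradient descent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUUnit:
    """A single GRU cell: a hypothetical sketch of the gated unit a NEAT
    genome could carry in place of a plain recurrent neuron."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)

        def block():
            # Input weights, recurrent weights, and bias for one gate.
            return (rng.standard_normal((n_hidden, n_in)) * 0.1,
                    rng.standard_normal((n_hidden, n_hidden)) * 0.1,
                    np.zeros(n_hidden))

        self.Wz, self.Uz, self.bz = block()  # update gate z
        self.Wr, self.Ur, self.br = block()  # reset gate r
        self.Wh, self.Uh, self.bh = block()  # candidate state
        self.h = np.zeros(n_hidden)          # persistent hidden state

    def step(self, x):
        # z_t = sigma(Wz x_t + Uz h_{t-1} + bz), likewise for r_t.
        z = sigmoid(self.Wz @ x + self.Uz @ self.h + self.bz)
        r = sigmoid(self.Wr @ x + self.Ur @ self.h + self.br)
        # Candidate state uses the reset-gated previous hidden state.
        h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * self.h) + self.bh)
        # z interpolates between the old state and the candidate.
        self.h = (1.0 - z) * self.h + z * h_cand
        return self.h

# Hypothetical usage: one sensor reading per control timestep.
unit = GRUUnit(n_in=4, n_hidden=8)
h = unit.step(np.array([0.1, 0.5, -0.2, 0.0]))
```

Because the update gate z can stay near zero, the hidden state can be carried almost unchanged across many timesteps; this gating is what lets the unit capture the long-term dependencies mentioned above, in contrast to a plain recurrent neuron whose state is overwritten at every step.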
