Stabilized Nested Rollout Policy Adaptation

Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm for single-player games. In this paper we propose a modification of NRPA that improves the stability of the algorithm. Experiments show that the modification improves results in several application domains: SameGame, the Traveling Salesman Problem with Time Windows, and Expression Discovery.
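
For orientation, here is a minimal Python sketch of baseline NRPA as it is usually described in the literature. It is not the stabilized variant proposed here, whose details this abstract does not give. The toy objective `score`, the function names, the `(step, move)` policy keys, and all parameter values are assumptions made for illustration only.

```python
import math
import random

def score(seq):
    # Hypothetical toy objective: reward positions where move == step % 3.
    return sum(1 for step, move in enumerate(seq) if move == step % 3)

def playout(policy, n_steps, n_moves, rng):
    # Sample each move with probability proportional to exp(policy weight).
    seq = []
    for step in range(n_steps):
        weights = [math.exp(policy.get((step, m), 0.0)) for m in range(n_moves)]
        seq.append(rng.choices(range(n_moves), weights=weights)[0])
    return seq

def adapt(policy, seq, n_moves, alpha=1.0):
    # Shift weight toward the moves of `seq`; probabilities are computed
    # from the old policy, updates are written to a copy.
    new = dict(policy)
    for step, chosen in enumerate(seq):
        weights = [math.exp(policy.get((step, m), 0.0)) for m in range(n_moves)]
        z = sum(weights)
        for m in range(n_moves):
            new[(step, m)] = new.get((step, m), 0.0) - alpha * weights[m] / z
        new[(step, chosen)] += alpha
    return new

def nrpa(level, policy, n_iter, n_steps, n_moves, rng):
    # Level 0 is a single playout; higher levels call the level below
    # n_iter times and adapt the policy toward the best sequence so far.
    if level == 0:
        seq = playout(policy, n_steps, n_moves, rng)
        return score(seq), seq
    best_score, best_seq = -math.inf, []
    for _ in range(n_iter):
        s, seq = nrpa(level - 1, policy, n_iter, n_steps, n_moves, rng)
        if s >= best_score:
            best_score, best_seq = s, seq
        policy = adapt(policy, best_seq, n_moves)
    return best_score, best_seq

if __name__ == "__main__":
    best, seq = nrpa(2, {}, 20, 6, 3, random.Random(0))
    print(best, seq)
```

Because each level adapts its policy after every lower-level call, a single noisy result can move the policy a long way; this sensitivity is the kind of behavior a stability-oriented modification would aim to reduce.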
