Monte Carlo Techniques in Planning

The Monte Carlo Tree Search (MCTS) algorithm has recently proved capable of solving difficult problems in optimization as well as game playing, including several problems that no conventional technique had been able to solve efficiently. In this thesis we investigate possible ways to use MCTS in the field of planning and scheduling. We analyze the problem theoretically, identifying the difficulties that arise when MCTS is applied in this field, and we propose solutions to these difficulties based on modifications of the algorithm and on preprocessing of the planning domain. We present the techniques we have developed for these tasks and combine them into a practical algorithm. We then specialize the method for a specific class of planning problems, transportation problems, and compare our planner with other planning systems.
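At the heart of the MCTS variant most commonly used for planning is the UCT selection rule, which balances exploiting actions with high average reward against exploring rarely tried ones. The sketch below illustrates that rule only; the `Node` class and the exploration constant are illustrative assumptions, not part of the thesis itself.

```python
import math

class Node:
    """A minimal, assumed tree-node representation for illustration."""
    def __init__(self):
        self.visits = 0          # times this node was selected so far
        self.total_reward = 0.0  # sum of simulation rewards backed up here
        self.children = []       # successor nodes (one per applicable action)

def uct_select(parent, c=math.sqrt(2)):
    """Pick the child maximizing the UCB1 value used in UCT:
    average reward (exploitation) + confidence bound (exploration)."""
    def ucb(child):
        if child.visits == 0:
            return float("inf")  # always try unvisited children first
        exploit = child.total_reward / child.visits
        explore = c * math.sqrt(math.log(parent.visits) / child.visits)
        return exploit + explore
    return max(parent.children, key=ucb)
```

In a full MCTS loop this selection step is repeated from the root to a leaf, followed by expansion, a random simulation, and backpropagation of the simulation reward along the visited path.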
