Utilising Uncertainty for Efficient Learning of Likely-Admissible Heuristics

Likely-admissible heuristics have previously been introduced as heuristics that are admissible with some probability. Although such heuristics only yield likely-optimal plans, they have the advantage that they are more feasible to learn from training data with machine learning algorithms. Ideally, this training data would consist of optimal plans, but such data is often prohibitively expensive to produce. To overcome this, previous work introduced a bootstrap procedure that generates training data through random task generation and incrementally learns on progressively more complex tasks. However, 1) random task generation is inefficient; and 2) the procedure trains on non-optimal plans, which causes errors to compound as learning progresses and results in high suboptimality. In this paper we introduce a framework that utilises uncertainty to overcome the shortcomings of previous approaches. In particular, we show that uncertainty can be used to explore task space efficiently when generating training tasks, and then to learn likely-admissible heuristics that yield low suboptimality. We illustrate the advantages of our approach on the 15-puzzle, 24-puzzle, 24-pancake and 15-blocksworld domains, using Bayesian neural networks to model uncertainty.
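
To make the notion of likely-admissibility concrete, the sketch below shows one simple way such a heuristic could be read off a Bayesian neural network's predictive distribution over the cost-to-go. This is an illustrative assumption, not the paper's exact formulation: it assumes a Gaussian predictive distribution with a given mean and standard deviation, and takes the (1 - alpha) quantile so that the estimate underestimates the true cost with probability roughly alpha.

```python
# Minimal sketch (assumed Gaussian predictive distribution; not the authors'
# exact method): derive a heuristic that is admissible with probability alpha
# from a Bayesian neural network's predictive mean and standard deviation.
from scipy.stats import norm


def likely_admissible_heuristic(mean: float, std: float, alpha: float = 0.95) -> float:
    """Return a cost-to-go estimate that lies below the true cost with
    probability (roughly) alpha, assuming the true cost follows N(mean, std^2)."""
    # The (1 - alpha) quantile of the predictive distribution is exceeded by
    # the true cost with probability alpha, i.e. it is likely-admissible.
    h = norm.ppf(1.0 - alpha, loc=mean, scale=std)
    return max(h, 0.0)  # cost-to-go estimates are never negative


# Example: predictive mean of 42 moves, std of 3, admissible with ~95% probability.
print(likely_admissible_heuristic(42.0, 3.0, alpha=0.95))
```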
