A Unified Framework for Planning with Learned Neural Network Transition Models

Automated planning with neural network transition models is a two-stage approach to solving planning problems with unknown transition models. The first stage learns the unknown transition model from data as a neural network, and the second stage compiles the learned model to either a Mixed-Integer Linear Programming (MILP) model or a Recurrent Neural Network (RNN) model, which is then optimized using an off-the-shelf solver. Previous studies have shown that each model has its own advantages and disadvantages. Namely, the MILP model can be solved optimally using a branch-and-bound algorithm but has been shown experimentally not to scale well for neural networks with multiple hidden layers. In contrast, the RNN model can be solved effectively using a gradient descent algorithm but works only under very restrictive assumptions. In this paper, we focus on improving the effectiveness of the second stage by introducing (i) a novel Lagrangian RNN architecture that models the previously ignored components of the planning problem as Lagrangian functions, and (ii) a novel framework that unifies the MILP and the Lagrangian RNN models such that the weakness of one model is complemented by the strength of the other. Experimentally, we show that our unifying framework significantly outperforms the standalone MILP model by solving 80% more problem instances, and we showcase its ability to find high-quality solutions to challenging automated planning problems with unknown transition models.
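To make the idea of folding constraints into a gradient-based planner concrete, the following is a minimal sketch and not the architecture proposed in this paper. It assumes a toy one-dimensional transition s_{t+1} = s_t + a_t as a stand-in for a learned neural network, a quadratic cost toward a goal state, action bounds handled by projection, and a state cap s_t <= cap handled by Lagrange multipliers updated with dual ascent. All names (plan_with_lagrangian, goal, cap, step, iters) are hypothetical and chosen only for illustration.

```python
import numpy as np

def plan_with_lagrangian(horizon=5, s0=0.0, goal=3.0, cap=2.0,
                         step=0.01, iters=10000):
    """Toy primal-dual planner over an unrolled 1-D transition model.

    Assumption: the transition s_{t+1} = s_t + a_t stands in for a
    learned neural network; it is not the model used in the paper.
    """
    actions = np.zeros(horizon)   # decision variables a_0 .. a_{H-1}
    lam = np.zeros(horizon)       # one multiplier per constraint s_t <= cap
    for _ in range(iters):
        # Forward pass: unroll the stand-in transition over the horizon.
        states = s0 + np.cumsum(actions)
        # Gradient of the Lagrangian with respect to each state:
        # d/ds_t [ (s_t - goal)^2 + lam_t * (s_t - cap) ]
        grad_states = 2.0 * (states - goal) + lam
        # Backward pass: a_k affects every later state, so accumulate
        # the state gradients from step k to the end of the horizon.
        grad_actions = np.cumsum(grad_states[::-1])[::-1]
        # Primal descent step, projected onto the action bounds [-1, 1].
        actions = np.clip(actions - step * grad_actions, -1.0, 1.0)
        # Dual ascent step on the violation, keeping multipliers >= 0.
        lam = np.maximum(0.0, lam + step * (states - cap))
    states = s0 + np.cumsum(actions)
    return actions, states, lam

actions, states, lam = plan_with_lagrangian()
print(np.round(actions, 2), np.round(states, 2))
```

In this toy setting the cap lies below the goal, so the multipliers grow until the plan stops at the cap rather than overshooting it. A gradient-only planner without the Lagrangian term would have no mechanism for respecting such a constraint, which is the kind of gap the abstract attributes to the plain RNN model.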
