Learning the Solution Manifold in Optimization and Its Application in Motion Planning

Optimization is an essential component for solving problems in a wide range of fields. Ideally, the objective function should be designed so that the solution is unique and the optimization problem can be solved stably. However, objective functions in practical applications are usually non-convex and sometimes even admit an infinite set of solutions. To address this issue, we propose to learn the solution manifold in optimization. We train a model conditioned on a latent variable such that the model represents an infinite set of solutions. In our framework, the problem is reduced to density estimation via importance sampling, and the latent representation of the solutions is learned by maximizing a variational lower bound. We apply the proposed algorithm to motion-planning problems, which involve the optimization of high-dimensional parameters. The experimental results indicate that the proposed algorithm learns the solution manifold and that the trained model represents an infinite set of homotopic solutions for motion-planning problems.
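
The following is a minimal sketch, not the authors' implementation, of the idea described in the abstract: draw candidate solutions from a broad proposal distribution, convert their objective values into normalized importance weights, and fit a latent-variable model by maximizing an importance-weighted variational lower bound (VAE-style). All names (`cost_fn`, `Encoder`, `Decoder`) and hyperparameters are illustrative assumptions; `cost_fn` is assumed to map a batch of candidate solutions to per-sample costs.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps a solution x to the parameters of q(z | x)."""
    def __init__(self, x_dim, z_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.log_var = nn.Linear(hidden, z_dim)

    def forward(self, x):
        h = self.net(x)
        return self.mu(h), self.log_var(h)

class Decoder(nn.Module):
    """Maps a latent variable z to a point on the solution manifold."""
    def __init__(self, z_dim, x_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, x_dim))

    def forward(self, z):
        return self.net(z)

def train_solution_manifold(cost_fn, x_dim, z_dim=2, n_samples=1024,
                            n_epochs=200, beta=1.0, temperature=1.0):
    enc, dec = Encoder(x_dim, z_dim), Decoder(z_dim, x_dim)
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()),
                           lr=1e-3)

    # Candidate solutions from a broad Gaussian proposal; their costs are
    # turned into normalized importance weights favoring low-cost solutions.
    x = torch.randn(n_samples, x_dim)
    with torch.no_grad():
        w = torch.softmax(-cost_fn(x) / temperature, dim=0)

    for _ in range(n_epochs):
        mu, log_var = enc(x)
        std = torch.exp(0.5 * log_var)
        z = mu + std * torch.randn_like(std)      # reparameterization trick
        recon = dec(z)
        # Per-sample (negative) ELBO: squared reconstruction error plus the
        # KL divergence of q(z | x) from a standard normal prior.
        rec_err = ((recon - x) ** 2).sum(dim=1)
        kl = 0.5 * (mu.pow(2) + log_var.exp() - 1.0 - log_var).sum(dim=1)
        loss = (w * (rec_err + beta * kl)).sum()  # importance-weighted bound
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Decoding samples of z then traces out the learned solution manifold.
    return dec
```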
