A Policy-Improving System with a Mixture Probability and Clustering Distributions for Unknown 3D Environments

Many policy-improving systems have been proposed for Reinforcement Learning (RL) agents that adapt quickly to environmental change by using statistical methods, such as a mixture model of Bayesian networks or a mixture probability with clustering distributions. However, using a mixture model of Bayesian networks increases the computational complexity of the system, so controlling that complexity becomes a necessary problem. Using a mixture probability and clustering distributions, on the other hand, allows the computational complexity to be controlled while maintaining the system's performance, but the computational load and the adaptation performance in more complex environments, such as 3D environments, still need to be examined. In this paper, we concentrate on the policy-improving system that uses a mixture probability and clustering distributions. We introduce new parameters and a modified reward process for experiments in 3D environments, and then investigate and discuss the performance of the proposed system based on the results.
