Risk Sensitive Reinforcement Learning Scheme Is Suitable for Learning on a Budget

Risk-sensitive reinforcement learning (risk-sensitive RL) has been studied by many researchers. These methods are based on prospect theory, which models the value function of a human. Although they are mainly aimed at imitating human behavior, there has been little discussion of their engineering significance. In this paper, we show that risk-sensitive RL is useful for online learning machines whose resources are limited. In such machines, part of the learned memory must be removed to make space for recording a new, important instance. The experimental results show that, under these conditions, risk-sensitive RL is superior to ordinary RL. This may also suggest that, because the human brain is built from a limited number of neurons, humans employ a risk-sensitive value function for learning.

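The following is a minimal sketch, not the authors' implementation, of the two ideas the abstract combines: a Q-function stored as a bounded instance memory that discards an old entry when the budget is exceeded, and a risk-sensitive update in which negative TD errors (losses) are weighted more heavily than positive ones, in the spirit of a prospect-theory-style value function. The class name, the `kappa` and `budget` parameters, and the least-recently-used replacement rule are illustrative assumptions.

```python
# Sketch: risk-sensitive Q-learning on a budgeted instance memory.
# Assumptions: discrete (hashable) states, LRU replacement, and an
# asymmetric piecewise-linear transform of the TD error.
import random
from collections import OrderedDict

class BudgetedRiskSensitiveQ:
    def __init__(self, n_actions, budget=50, alpha=0.1, gamma=0.95, kappa=0.5):
        self.n_actions = n_actions
        self.budget = budget          # maximum number of stored state entries
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.kappa = kappa            # risk sensitivity: 0 gives ordinary (risk-neutral) RL
        self.memory = OrderedDict()   # state -> list of Q-values, kept in LRU order

    def _q(self, state):
        # Fetch (or create) the Q-values for a state and mark it as recently used.
        if state not in self.memory:
            if len(self.memory) >= self.budget:
                # Budget exceeded: remove the least recently used entry
                # to create space for the new instance.
                self.memory.popitem(last=False)
            self.memory[state] = [0.0] * self.n_actions
        self.memory.move_to_end(state)
        return self.memory[state]

    def act(self, state, epsilon=0.1):
        # Epsilon-greedy action selection over the stored Q-values.
        if random.random() < epsilon:
            return random.randrange(self.n_actions)
        q = self._q(state)
        return max(range(self.n_actions), key=lambda a: q[a])

    def update(self, state, action, reward, next_state, done):
        q = self._q(state)
        target = reward if done else reward + self.gamma * max(self._q(next_state))
        delta = target - q[action]
        # Risk-sensitive transform of the TD error: losses (negative errors)
        # are amplified and gains are attenuated, mimicking the asymmetry of
        # a human-like, prospect-theory-style value function.
        if delta >= 0:
            delta *= (1.0 - self.kappa)
        else:
            delta *= (1.0 + self.kappa)
        q[action] += self.alpha * delta
```

Setting `kappa = 0` recovers an ordinary risk-neutral update, so the same memory budget can be used to compare the two schemes, which is the kind of comparison the experiments report.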