Paper information - A parameter control method in reinforcement learning to rapidly follow unexpected environmental changes.

To follow unexpected environmental changes rapidly, we propose a parameter control method for reinforcement learning that shifts each learning parameter in an appropriate direction. We determine each direction from the relationships between behaviors and neuromodulators, taking emergency as the key concept. Computer experiments show that agents using the proposed method respond rapidly to unexpected environmental changes, regardless of which of two reinforcement learning algorithms (Q-learning and the actor-critic (AC) architecture) or which of two learning problems (discontinuous and continuous state-action problems) is used.
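As a rough illustration of what controlling learning parameters can look like in tabular Q-learning, the sketch below exposes a learning rate `alpha`, a discount factor `gamma`, and a softmax inverse temperature `beta`, and swaps in a second parameter set when a running measure of surprise is high. The abstract does not state which parameters are changed, in which directions, or how a change is detected, so the surprise measure, the threshold, the function names, and both parameter sets here are assumptions chosen for illustration, not the authors' method.

```python
import math
import random


def softmax_choice(q_values, beta, rng):
    """Sample an action index with probability proportional to exp(beta * Q)."""
    top = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (q - top)) for q in q_values]
    r = rng.random() * sum(weights)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1


def q_update(q, s, a, reward, s_next, alpha, gamma):
    """Apply one tabular Q-learning update in place and return the TD error."""
    td = reward + gamma * max(q[s_next]) - q[s][a]
    q[s][a] += alpha * td
    return td


def update_surprise(surprise, td, rate=0.1):
    """Running average of |TD error| (a hypothetical change detector)."""
    return (1.0 - rate) * surprise + rate * abs(td)


def control_parameters(surprise, base, emergency, threshold=0.5):
    """Hypothetical rule: use the emergency parameter set while surprise is high."""
    return emergency if surprise > threshold else base


# Placeholder parameter sets; the values and directions are assumptions.
BASE = {"alpha": 0.1, "gamma": 0.9, "beta": 5.0}
EMERGENCY = {"alpha": 0.5, "gamma": 0.9, "beta": 1.0}
```

A typical loop would call `control_parameters` once per step, pass the selected `alpha`, `gamma`, and `beta` to `softmax_choice` and `q_update`, and then feed the returned TD error into `update_surprise`.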

Kazushi Murakoshi | Junya Mizuno

[1] Kenji Doya, et al. Metalearning and neuromodulation, 2002, Neural Networks.

[2] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[3] James S. Albus, et al. A New Approach to Manipulator Control: The Cerebellar Model Articulation Controller, 1975.

[4] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.

[5] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.

[6] Junichiro Yoshimoto, et al. Control of exploitation-exploration meta-parameter in reinforcement learning, 2002, Neural Networks.

[7] James S. Albus, et al. New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC), 1975.

[8] Kenji Doya, et al. Meta-learning in Reinforcement Learning, 2003, Neural Networks.

[9] S. Sara, et al. Response to Novelty and its Rapid Habituation in Locus Coeruleus Neurons of the Freely Exploring Rat, 1995, The European Journal of Neuroscience.

[10] J. Leander, et al. Selective Serotonin Reuptake Inhibitors Decrease Impulsive Behavior as Measured by an Adjusting Delay Procedure in the Pigeon, 2002, Neuropsychopharmacology.

[11] F. Fadda, et al. Hippocampal acetylcholine release correlates with spatial learning performance in freely moving rats, 2000, Neuroreport.