Paper Information - Q-learning System Based on Cooperative Least Squares Support Vector Machine


Abstract

To address the slow convergence of reinforcement learning systems, a Q-learning system based on a cooperative least squares support vector machine is proposed for continuous state spaces and discrete action spaces. The proposed Q-learning system is composed of a least squares support vector regression machine (LS-SVRM) and a least squares support vector classification machine (LS-SVCM). The LS-SVRM approximates the mapping from a state-action pair to a value function, and the LS-SVCM approximates the mapping from the continuous state space to the discrete action space. In addition, the LS-SVCM supplies the LS-SVRM with dynamic, real-time knowledge or advice (a suggested action) to accelerate its learning process. Simulation studies on mountain car control show that, compared with a Q-learning system based on a single LS-SVRM, the proposed Q-learning system converges faster and achieves better learning performance.
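As a rough illustration of the value-approximation half of this design, the sketch below fits a least squares support vector regressor to state-action pairs and picks the greedy discrete action. It uses the standard LS-SVM dual linear system, not the paper's own implementation. The names `LSSVR` and `greedy_action`, the RBF kernel, the hyperparameters `gamma` and `sigma`, and the toy target Q(s, a) = -(s - a)^2 are all assumptions made for the example. The cooperative LS-SVCM advice mechanism described in the abstract is not shown.

```python
import numpy as np

def rbf_kernel(A, B, sigma=0.5):
    # Gaussian kernel matrix between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

class LSSVR:
    """Least squares support vector regression in its standard dual form."""

    def __init__(self, gamma=1000.0, sigma=0.5):
        # gamma: regularization weight, sigma: kernel width (assumed values).
        self.gamma, self.sigma = gamma, sigma

    def fit(self, X, y):
        n = len(X)
        K = rbf_kernel(X, X, self.sigma)
        # Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
        A = np.zeros((n + 1, n + 1))
        A[0, 1:] = 1.0
        A[1:, 0] = 1.0
        A[1:, 1:] = K + np.eye(n) / self.gamma
        sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
        self.b, self.alpha, self.X = sol[0], sol[1:], X
        return self

    def predict(self, X):
        # f(x) = sum_i alpha_i * k(x, x_i) + b
        return rbf_kernel(X, self.X, self.sigma) @ self.alpha + self.b

def greedy_action(q_model, state, actions):
    # Evaluate Q(s, a) for every discrete action and return the best one.
    sa = np.array([np.append(state, a) for a in actions], dtype=float)
    return actions[int(np.argmax(q_model.predict(sa)))]

# Toy usage with hypothetical data: a 1-D state and three discrete actions.
actions = [-1, 0, 1]
states = np.linspace(-1.0, 1.0, 11)
X = np.array([[s, a] for s in states for a in actions], dtype=float)
y = -(X[:, 0] - X[:, 1]) ** 2          # stand-in for learned Q targets
q_model = LSSVR().fit(X, y)
best = greedy_action(q_model, np.array([0.9]), actions)
```

In an actual Q-learning loop the targets `y` would come from temporal-difference updates rather than a fixed formula, and the regressor would be refit or updated as new samples arrive.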
