Paper Information - Empirical Study of the Sensitivity of CACLA to Sub-optimal Parameter Setting in Learning Feedback Controllers


The Continuous Actor-Critic Learning Automaton (CACLA) offers an interesting alternative to traditional approaches to feedback control problems. In this paper, we report results obtained on an inertial model of a feed drive under potentially sub-optimal parameter settings and designer decisions. Specifically, we tested different reward signals, different numbers of features for approximating value functions and policies, and different learning gains. The results show CACLA to be highly robust to these choices.

Manuel Graña | Borja Fernández-Gauna | Igor Ansoategui | Ismael Etxeberria
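
For readers unfamiliar with CACLA, the sketch below shows where the three design choices studied in the abstract (reward signal, number of features, learning gains) enter a single update step. It is an illustration under stated assumptions, not the authors' implementation: the scalar state, the normalized Gaussian features, the linear critic and actor, and all names and default values (`gaussian_features`, `cacla_step`, `alpha`, `beta`, `gamma`) are hypothetical choices made for this example.

```python
import math

def gaussian_features(x, centers, width):
    """Normalized Gaussian features of a scalar state (assumed feature type).

    The number of centers is the "number of features" design choice.
    """
    raw = [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def dot(weights, phi):
    """Linear function approximator: weighted sum of the features."""
    return sum(w * f for w, f in zip(weights, phi))

def cacla_step(v_w, pi_w, phi, phi_next, action, reward,
               alpha=0.1, beta=0.1, gamma=0.9):
    """Apply one CACLA update in place and return the TD error.

    v_w, pi_w : critic and actor weight lists (linear approximators, assumed)
    alpha, beta : critic and actor learning gains (illustrative values)
    reward : value of whichever reward signal the designer chose
    """
    # Temporal-difference error of the critic.
    delta = reward + gamma * dot(v_w, phi_next) - dot(v_w, phi)
    # Difference between the explored action and the actor's current output.
    actor_error = action - dot(pi_w, phi)
    # The critic is always updated.
    for i, f in enumerate(phi):
        v_w[i] += alpha * delta * f
    # The actor moves toward the explored action only if the TD error is positive.
    if delta > 0:
        for i, f in enumerate(phi):
            pi_w[i] += beta * actor_error * f
    return delta
```

In this sketch, the actor update depends only on the sign of the TD error, not on its magnitude.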
