Derivation of integrated state equation for combined outputs-inputs vector of discrete-time linear time-invariant system and its application to reinforcement learning
K. Doya | E. Uchibe | H. Anai | Tomotake Sasaki | H. Yanami | Hidenao Iwane
[1] Thomas Kailath, et al. Linear Systems, 1980.
[2] Kang-Zhi Liu, et al. A New Optimal Digital Output Feedback Control and Its Application to the Control of Mechanical Systems, 1986.
[3] Shigeyasu Kawaji. On the Digital Control using Dead-beat Observers, 1988.
[4] Andrew G. Barto, et al. Adaptive linear quadratic control using policy iteration, 1994, Proceedings of 1994 American Control Conference - ACC '94.
[5] Steven J. Bradtke, et al. Incremental dynamic programming for on-line adaptive optimal control, 1995.
[6] J. Doyle, et al. Robust and optimal control, 1995, Proceedings of 35th IEEE Conference on Decision and Control.
[7] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[8] Jun Morimoto, et al. Reinforcement Learning State Estimator, 2007, Neural Computation.
[9] Frank L. Lewis, et al. Reinforcement Learning for Partially Observable Dynamic Processes: Adaptive Dynamic Programming Using Measured Output Data, 2011, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[10] Kunihisa Okano, et al. Stabilization of uncertain systems with finite data rates and Markovian packet losses, 2013, 2013 European Control Conference (ECC).
[11] Hirokazu Anai, et al. Policy gradient reinforcement learning method for discrete-time linear quadratic regulation problem using estimated state value function, 2017, 2017 56th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE).