Lebesgue-Sampling-Based Optimal Control Problems With Time Aggregation

We formulate the Lebesgue-sampling-based optimal control problem and show that it can be solved by the time aggregation approach from Markov decision process (MDP) theory. Policy-iteration-based and reinforcement-learning-based methods are developed to find the optimal policies. Both analytical solutions and sample-path-based algorithms are given. Compared to the periodic-sampling scheme, the Lebesgue sampling scheme improves system performance.
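To make the contrast between the two sampling schemes concrete, the following minimal sketch compares periodic sampling with one common event-based rule, in which a sample is taken whenever the signal has moved at least `delta` away from the last sampled value. The abstract does not state the paper's exact triggering condition, so this rule, the function names `lebesgue_sample` and `periodic_sample`, and the toy signal are all assumptions made for illustration only.

```python
def lebesgue_sample(signal, delta):
    """Return indices sampled by a send-on-delta rule (assumed trigger).

    A sample is taken at index 0, and then whenever the signal differs
    from the last sampled value by at least `delta`.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    if not signal:
        return []
    indices = [0]
    last = signal[0]
    for k in range(1, len(signal)):
        if abs(signal[k] - last) >= delta:
            indices.append(k)
            last = signal[k]
    return indices


def periodic_sample(signal, period):
    """Return indices sampled every `period` steps, regardless of the signal."""
    if period <= 0:
        raise ValueError("period must be positive")
    return list(range(0, len(signal), period))


# Hypothetical toy signal: flat stretches separated by two jumps.
signal = [0.0, 0.1, 0.2, 0.6, 0.7, 0.65, 1.3, 1.3, 1.3]
event_idx = lebesgue_sample(signal, delta=0.5)
clock_idx = periodic_sample(signal, period=2)
print(event_idx)  # samples only where the signal has moved by >= 0.5
print(clock_idx)  # samples on a fixed clock
```

In this toy case the event-based rule samples only when the state has changed appreciably, while the periodic rule spends samples on flat stretches. The sketch says nothing about control performance, which is what the time aggregation analysis addresses.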
