An adaptive optimization scheme with satisfactory transient performance

Adaptive optimization (AO) schemes based on stochastic approximation principles, such as the Random Directions Kiefer-Wolfowitz (RDKW), Simultaneous Perturbation Stochastic Approximation (SPSA), and Adaptive Fine-Tuning (AFT) algorithms, have a serious disadvantage: they do not guarantee satisfactory transient behavior, because they require random or random-like perturbations of the parameter vector. Such perturbations may produce particularly large values of the objective function, which can result in severely degraded performance or stability problems when these methods are applied to closed-loop controller optimization. In this paper, we introduce and analyze a new algorithm that alleviates this problem. Mathematical analysis establishes satisfactory transient performance and convergence of the proposed scheme under a general set of assumptions. Application to the adaptive optimization of a large-scale, complex control system demonstrates the efficiency of the proposed scheme.
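To make the transient issue concrete, the following is a minimal sketch of one SPSA-style gradient estimate in its standard two-sided form. It is not the scheme proposed in this paper. The names `spsa_gradient`, `quadratic`, and `worst_probe`, the perturbation size `c`, and the quadratic test objective are all assumptions chosen for illustration. The point is that the objective must be evaluated at the randomly perturbed points, so the system actually experiences those (possibly large) objective values.

```python
import random


def spsa_gradient(objective, theta, c, rng):
    """One two-sided SPSA gradient estimate (illustrative sketch).

    Returns the gradient estimate and the larger of the two objective
    values observed at the perturbed points.
    """
    # Random +/-1 (Rademacher) perturbation direction, one entry per parameter.
    delta = [rng.choice((-1.0, 1.0)) for _ in theta]
    theta_plus = [t + c * d for t, d in zip(theta, delta)]
    theta_minus = [t - c * d for t, d in zip(theta, delta)]

    # The objective is evaluated AT the perturbed parameters; in a closed-loop
    # setting this means running the controller with those parameters.
    j_plus = objective(theta_plus)
    j_minus = objective(theta_minus)

    # All components share the same two measurements.
    grad = [(j_plus - j_minus) / (2.0 * c * d) for d in delta]
    return grad, max(j_plus, j_minus)


def quadratic(theta):
    """Hypothetical test objective (assumption, not from the paper)."""
    return sum(t * t for t in theta)


rng = random.Random(0)
theta = [1.0, -2.0]
grad, worst_probe = spsa_gradient(quadratic, theta, c=0.5, rng=rng)
print("objective at theta:", quadratic(theta))
print("worst objective seen while probing:", worst_probe)
```

For this convex test objective, the worse of the two probes always exceeds the objective at the unperturbed `theta`, which is a small-scale picture of the transient degradation described above.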
