Reward strategies for adaptive start-up scheduling of power plant

Power plant start-up scheduling aims to minimize start-up time while keeping the maximum turbine-rotor stresses within limits. A shorter start-up reduces the fuel and electricity consumed during start-up and makes the plant more adaptable to changes in electricity demand, and scheduling online further increases operational flexibility. The start-up scheduling problem can be formulated as a constrained combinatorial optimization problem. Its search space is wide and high-dimensional and contains many local optima, and the optimal schedule lies near the boundary of the feasible space. To make the search efficient and robust, we propose combining genetic algorithms (GAs) with an enforcement operator that focuses the search along this boundary and with other local search strategies, such as a reuse function and tabu search. We also propose integrating GAs with reinforcement learning: during the search, the GA guides learning toward promising areas, while reinforcement learning can generate a good schedule early in the search. After learning representative optimal schedules, the search performance virtually satisfies the goal of this research: finding optimal or near-optimal schedules within 30 seconds. For industrial use, the design of the reward strategy is crucial. We show that (a) positive rewards succeed with both low- and high-dimensional reinforcement-learning output, and (b) negative rewards succeed only with low-dimensional output. We present the proposed model together with analysis and test results.
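To make the idea of focusing the search along the feasibility boundary concrete, the following is a minimal sketch and not the authors' enforcement operator. It assumes, purely for illustration, that a schedule is a list of ramp rates, that peak rotor stress grows monotonically when all rates are scaled up, and that a schedule scaled to zero is feasible. The names `enforce_to_boundary` and `toy_peak_stress` are hypothetical, and the toy stress function stands in for a real plant simulation.

```python
from typing import Callable, List, Sequence


def enforce_to_boundary(
    schedule: Sequence[float],
    peak_stress: Callable[[Sequence[float]], float],
    stress_limit: float,
    max_scale: float = 4.0,
    iters: int = 40,
) -> List[float]:
    """Scale a candidate schedule so it sits just inside the stress limit.

    Assumption: peak_stress is monotone in the scale factor, so bisection
    on that factor finds the boundary of the feasible region.
    """
    def scaled(k: float) -> List[float]:
        return [k * x for x in schedule]

    lo, hi = 0.0, max_scale
    # If even the largest allowed scale is feasible, return it directly.
    if peak_stress(scaled(hi)) <= stress_limit:
        return scaled(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if peak_stress(scaled(mid)) <= stress_limit:
            lo = mid  # still feasible: push toward the boundary
        else:
            hi = mid  # infeasible: pull back
    return scaled(lo)  # lo is always a feasible scale


def toy_peak_stress(rates: Sequence[float]) -> float:
    """Hypothetical stand-in for a rotor stress simulation."""
    return 10.0 * max(rates)


candidate = [1.0, 2.0, 1.5]  # illustrative ramp rates, arbitrary units
enforced = enforce_to_boundary(candidate, toy_peak_stress, stress_limit=30.0)
```

In a GA, an operator of this kind would be applied to each offspring before evaluation, so that the population stays feasible while remaining close to the constraint boundary where, according to the abstract, the optimal schedule lies.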
