Multi-objective optimization by reinforcement learning for power system dispatch and voltage stability

This paper presents a new method, Multi-objective Optimization by Reinforcement Learning (MORL), for solving the optimal power system dispatch and voltage stability problem. In MORL, the search proceeds along individual dimensions of a high-dimensional space via paths selected according to an estimated path value, which represents the potential of finding a better solution. MORL is compared with the multi-objective evolutionary algorithm based on decomposition (MOEA/D) on multi-objective optimal power flow problems in power systems. The simulation results demonstrate that MORL is superior to MOEA/D: it finds wider and more evenly distributed Pareto fronts, obtains more accurate Pareto-optimal solutions, and requires less computation time.

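The abstract only outlines the mechanism, so the following is a minimal illustrative sketch (not the authors' implementation) of the core idea: each decision variable (dimension) is perturbed along one of a few candidate "paths", a learned path value estimates the potential that the path yields a non-dominated solution, and non-dominated points are kept in a Pareto archive. The two objectives (a quadratic "fuel cost" and a "voltage stability" proxy), the epsilon-greedy path selection, and the value-update rule are all assumptions made for illustration.

```python
# Minimal sketch of dimension-wise search guided by learned "path values",
# in the spirit of the MORL idea described in the abstract. All objectives,
# names, and the update rule are illustrative assumptions.
import random

def cost(x):            # hypothetical objective 1: quadratic stand-in for fuel cost
    return sum((xi - 0.3) ** 2 for xi in x)

def voltage_index(x):   # hypothetical objective 2: proxy voltage-stability index
    return sum((xi + 0.4) ** 2 for xi in x)

def dominates(f, g):    # Pareto dominance for minimization
    return all(a <= b for a, b in zip(f, g)) and any(a < b for a, b in zip(f, g))

def morl_sketch(dim=5, paths=(-0.1, -0.01, 0.01, 0.1),
                iters=2000, alpha=0.1, eps=0.1, seed=0):
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    # q[d][p]: estimated potential that moving dimension d along path p
    # produces a non-dominated solution
    q = [[0.0 for _ in paths] for _ in range(dim)]
    archive = [(tuple(x), (cost(x), voltage_index(x)))]

    for _ in range(iters):
        d = rng.randrange(dim)                       # search one dimension at a time
        if rng.random() < eps:                       # epsilon-greedy path selection
            p = rng.randrange(len(paths))
        else:
            p = max(range(len(paths)), key=lambda i: q[d][i])
        cand = list(x)
        cand[d] += paths[p]
        f = (cost(cand), voltage_index(cand))
        improved = not any(dominates(g, f) for _, g in archive)
        # reinforce the chosen path according to whether it found a
        # non-dominated point (simple stand-in for the paper's path-value update)
        q[d][p] += alpha * ((1.0 if improved else 0.0) - q[d][p])
        if improved:
            archive = [(s, g) for s, g in archive if not dominates(f, g)]
            archive.append((tuple(cand), f))
            x = cand
    return archive

if __name__ == "__main__":
    for sol, objs in sorted(morl_sketch(), key=lambda t: t[1][0]):
        print(f"cost={objs[0]:.4f}  voltage_index={objs[1]:.4f}")
```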