A new approximate dynamic programming algorithm based on an actor–critic framework for optimal control of alkali–surfactant–polymer flooding

ABSTRACT An approximate dynamic programming algorithm based on an actor–critic framework is proposed in this article for the optimal control of alkali–surfactant–polymer (ASP) flooding. In this framework, the critic approximates the optimal value function and the actor approximates the control strategy. First, a linear basis function approximator is used to approximate the value function, and a method for constructing the basis functions from system characteristics is introduced. Furthermore, since the injection concentration in ASP flooding is restricted to a fixed interval, an action weighting method is adopted to constrain and approximate the optimal control action. The value function parameter and the two strategy parameters are updated by gradient descent, and an eligibility trace is introduced to accelerate convergence. Finally, an ASP flooding case with four injection wells and nine production wells is used to test the proposed method.
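The ingredients named in the abstract (a linear-basis critic, a bounded weighted-action actor, gradient updates, and an eligibility trace) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the scalar toy dynamics, the quadratic reward, the polynomial basis in `features`, the softmax weighting over three hypothetical candidate injection levels, the actor update rule, and all step sizes. It is not the authors' algorithm and does not model ASP flooding.

```python
import math
import random


def features(x):
    """Polynomial basis for a scalar state (illustrative choice, not from the article)."""
    return [1.0, x, x * x]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def weighted_action(theta, phi, candidates):
    """Softmax-weighted average of candidate actions, so the result stays inside their range."""
    prefs = [dot(row, phi) for row in theta]
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(wt * a for wt, a in zip(weights, candidates)), weights


def train(episodes=200, steps=30, gamma=0.95, lam=0.8,
          alpha_w=0.02, alpha_theta=0.01, seed=0):
    rng = random.Random(seed)
    candidates = [0.0, 0.5, 1.0]            # hypothetical bounded injection levels
    w = [0.0, 0.0, 0.0]                     # critic (value function) parameters
    theta = [[0.0] * 3 for _ in candidates]  # actor parameters, one row per candidate
    for _ in range(episodes):
        x = rng.uniform(-1.0, 1.0)
        z = [0.0, 0.0, 0.0]                 # critic eligibility trace
        for _ in range(steps):
            phi = features(x)
            u_mean, weights = weighted_action(theta, phi, candidates)
            # Gaussian exploration, clipped back into the admissible interval.
            u = min(max(u_mean + rng.gauss(0.0, 0.1), candidates[0]), candidates[-1])
            x_next = 0.9 * x + 0.2 * (u - 0.5)   # toy dynamics (assumption)
            reward = -(x_next ** 2)               # toy reward (assumption)
            # Temporal-difference error from the linear critic.
            delta = reward + gamma * dot(w, features(x_next)) - dot(w, phi)
            # Accumulating trace, then semi-gradient critic update.
            z = [gamma * lam * zi + pi for zi, pi in zip(z, phi)]
            w = [wi + alpha_w * delta * zi for wi, zi in zip(w, z)]
            # Actor: shift the weighted action toward explored actions with positive TD error.
            # d(u_mean)/d(theta[i]) = weights[i] * (candidates[i] - u_mean) * phi
            for i, row in enumerate(theta):
                g = weights[i] * (candidates[i] - u_mean)
                theta[i] = [t + alpha_theta * delta * (u - u_mean) * g * p
                            for t, p in zip(row, phi)]
            x = x_next
    return w, theta, candidates
```

Calling `train()` returns the critic weights, the actor parameters, and the candidate levels; `weighted_action(theta, features(x), candidates)` then gives a control value that lies within the candidate interval by construction, which is the property the action weighting idea is meant to guarantee.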
