Accelerating and improving deep reinforcement learning-based active flow control: transfer training of policy network

Deep reinforcement learning (DRL) has gradually emerged as an effective and novel method for active flow control, with outstanding performance. This paper explores strategies for improving the learning efficiency and control performance on a new task by reusing existing control experience. Specifically, the proximal policy optimization (PPO) algorithm is used to control the flow past a circular cylinder with jets. The DRL controllers trained from randomly initialized parameters achieve drag reductions of 8%, 18.7%, 18.4%, and 25.2% at Re=100, 200, 300, and 1000, respectively, and the cases with higher Reynolds numbers require more episodes to converge because of the increased flow complexity. Furthermore, an agent trained at a higher Reynolds number delivers satisfactory control performance when applied to lower-Reynolds-number cases, which indicates a strong correlation between the control policies learned for flows under different conditions. To better exploit the experience embodied in the control policy of a trained agent, the flow control tasks at Re=200, 300, and 1000 are retrained starting from the agents trained at Re=100, 200, and 300, respectively. Our results show that a dramatic enhancement of learning efficiency can be achieved: the number of training episodes is reduced to less than 20% of that required by agents trained with random initialization. The strong performance of this transfer training of the DRL agent demonstrates its potential to economize training cost and improve control effectiveness, especially for complex control tasks.
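To make the transfer-training idea concrete, the following is a minimal PyTorch sketch, not the authors' code: the network architecture, the probe and jet counts, and the file names are assumptions, and the PPO update loop against the flow environment is elided. It only illustrates warm-starting the policy network for a higher-Reynolds-number task from the weights of an agent trained at a lower Reynolds number.

```python
import torch
import torch.nn as nn

# Assumed observation/action sizes; typical cylinder-jet setups observe
# pressure probes and act on jet mass-flow rates (values are hypothetical).
N_PROBES = 151   # number of pressure sensors (observation dimension)
N_JETS = 2       # number of control jets (action dimension)

class PolicyNetwork(nn.Module):
    """A generic Gaussian actor of the kind PPO optimizes (a sketch,
    not the paper's exact architecture)."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
        )
        self.mean = nn.Linear(hidden, act_dim)              # mean of the action distribution
        self.log_std = nn.Parameter(torch.zeros(act_dim))   # learnable log std

    def forward(self, obs: torch.Tensor):
        h = self.backbone(obs)
        return self.mean(h), self.log_std.exp()

# --- Baseline training at Re = 100: random initialization ---
policy_re100 = PolicyNetwork(N_PROBES, N_JETS)
# ... run PPO updates against the Re = 100 flow environment here, then save ...
torch.save(policy_re100.state_dict(), "policy_re100.pt")

# --- Transfer training at Re = 200: warm-start from the Re = 100 agent ---
policy_re200 = PolicyNetwork(N_PROBES, N_JETS)
policy_re200.load_state_dict(torch.load("policy_re100.pt"))
# Continuing PPO from these weights is the transfer step; per the abstract,
# it reportedly converges in fewer than 20% of the episodes needed when
# training from scratch.
```

The design choice here is that only the parameter initialization changes between the baseline and the transferred agent; the PPO algorithm, network shape, and interface to the flow solver stay the same, which is what allows the pretrained policy to be reused directly.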