Deep reinforcement learning for automated stock trading: an ensemble strategy

Stock trading strategies play a critical role in investment. However, it is challenging to design a profitable strategy in a complex and dynamic stock market. In this paper, we propose an ensemble strategy that employs deep reinforcement learning schemes to learn a stock trading strategy by maximizing investment return. We train a deep reinforcement learning agent and obtain an ensemble trading strategy using three actor-critic based algorithms: Proximal Policy Optimization (PPO), Advantage Actor Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). The ensemble strategy inherits and integrates the best features of the three algorithms, thereby adjusting robustly to different market situations. To avoid the large memory consumption of training networks with a continuous action space, we employ a load-on-demand technique for processing very large data. We test our algorithms on the 30 Dow Jones stocks, which have adequate liquidity. The performance of the trading agent under each reinforcement learning algorithm is evaluated and compared with two baselines: the Dow Jones Industrial Average index and the traditional min-variance portfolio allocation strategy. The proposed deep ensemble strategy is shown to outperform the three individual algorithms and both baselines in terms of risk-adjusted return, as measured by the Sharpe ratio.
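The role of the Sharpe ratio as a risk-adjusted performance measure can be illustrated with a minimal Python sketch. The abstract does not state how the ensemble combines the three agents, so the selection rule below (pick the agent with the highest Sharpe ratio over a validation window) is an assumption made for illustration only. The function names `sharpe_ratio` and `pick_agent`, the 252-day annualization factor, and the return series are likewise hypothetical and are not results from the paper.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a sequence of per-period returns.

    risk_free_rate is an annual rate; it is spread evenly over the periods.
    """
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    std = statistics.stdev(excess)  # sample standard deviation
    if std == 0:
        return 0.0  # no variability: treat the ratio as undefined and return 0
    return statistics.mean(excess) / std * math.sqrt(periods_per_year)


def pick_agent(validation_returns):
    """Assumed selection rule: name of the agent with the highest Sharpe ratio."""
    return max(validation_returns,
               key=lambda name: sharpe_ratio(validation_returns[name]))


# Made-up daily returns for each agent over a short validation window.
validation_returns = {
    "PPO": [0.010, -0.020, 0.015, 0.005],
    "A2C": [0.004, 0.006, 0.005, 0.007],
    "DDPG": [-0.010, 0.020, -0.015, 0.010],
}

best = pick_agent(validation_returns)
print(best)
```

In this toy data the agent with the smallest return variability wins even though its largest single-day gain is lower than the others', which is the behavior a risk-adjusted measure is meant to capture.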
