MAPS: Multi-Agent reinforcement learning-based Portfolio management System

Generating investment strategies for stock markets with advanced deep learning methods has recently been a topic of interest. Most existing deep learning methods focus on proposing an optimal model or network architecture by maximizing return. However, these models often fail to consider and adapt to continuously changing market conditions. In this paper, we propose the Multi-Agent reinforcement learning-based Portfolio management System (MAPS). MAPS is a cooperative system in which each agent is an independent "investor" creating its own portfolio. During training, a carefully designed loss function guides each agent to act as diversely as possible while maximizing its own return. As a result, MAPS as a system ends up with a diversified portfolio. Experimental results on 12 years of US market data show that MAPS outperforms most of the baselines in terms of Sharpe ratio. Furthermore, our results show that adding more agents to the system yields a higher Sharpe ratio, because the more diversified portfolio lowers risk.
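The link between diversification and Sharpe ratio mentioned above can be illustrated with a small numerical sketch. This is not the MAPS loss function or training procedure, which the abstract does not specify. The function names, the equal-weight combination rule, and the return values are all hypothetical and chosen only for illustration.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a list of per-period returns."""
    excess = [r - risk_free for r in returns]
    vol = statistics.stdev(excess)  # sample standard deviation
    if vol == 0:
        raise ValueError("Sharpe ratio is undefined for zero volatility")
    return statistics.mean(excess) / vol * math.sqrt(periods_per_year)


def combine_equal_weight(agent_returns):
    """Average the per-period returns of several agents' portfolios.

    Assumption: the system allocates capital equally across agents.
    """
    return [statistics.mean(period) for period in zip(*agent_returns)]


# Hypothetical daily returns for two agents that act differently:
# same average return, but their fluctuations partly offset each other.
agent_a = [0.02, -0.01, 0.03, 0.00]
agent_b = [-0.01, 0.02, 0.00, 0.03]

system = combine_equal_weight([agent_a, agent_b])

print("agent A Sharpe:", sharpe_ratio(agent_a))
print("agent B Sharpe:", sharpe_ratio(agent_b))
print("system Sharpe: ", sharpe_ratio(system))
```

In this toy data both agents earn the same mean return, so the combined portfolio keeps that mean while its volatility drops, which raises the Sharpe ratio. The effect depends on the agents' returns being imperfectly correlated, which is what encouraging diverse behavior is meant to achieve.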
