Explainable deep reinforcement learning for portfolio management: an empirical approach

Deep reinforcement learning (DRL) has been widely studied for the portfolio management task. However, a DRL-based trading strategy is difficult to understand because of the black-box nature of deep neural networks. In this paper, we propose an empirical approach to explain the strategies of DRL agents for the portfolio management task. First, we use a linear model in hindsight as the reference model, which finds the best portfolio weights under the assumption that the actual stock returns are known in advance. In particular, we use the coefficients of this linear model in hindsight as the reference feature weights. Second, for DRL agents, we use integrated gradients to define the feature weights, which are the coefficients between reward and features under a linear regression model. Third, we study the prediction power in two cases, single-step prediction and multi-step prediction. In particular, we quantify the prediction power by calculating the linear correlations between the feature weights of a DRL agent and the reference feature weights, and similarly for machine learning methods. Finally, we evaluate a portfolio management task on Dow Jones 30 constituent stocks from 01/01/2009 to 09/01/2021. Our approach empirically reveals that a DRL agent exhibits stronger multi-step prediction power than machine learning methods.

CCS CONCEPTS • Computing methodologies → Machine learning; Neural networks; Markov decision processes; Reinforcement learning; Policy iteration; Value iteration.
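The two quantitative ingredients named in the abstract, integrated gradients as feature weights and a linear correlation against reference feature weights, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the toy scoring function `f`, its weight vector `w`, the all-zeros baseline, and the values in `reference_weights` are all hypothetical stand-ins chosen only to make the example self-contained.

```python
import numpy as np

def integrated_gradients(f_grad, x, baseline, steps=50):
    """Approximate integrated gradients with a midpoint Riemann sum.

    f_grad returns the gradient of a scalar function at a point.
    """
    alphas = (np.arange(steps) + 0.5) / steps            # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)   # shape (steps, d)
    grads = np.stack([f_grad(p) for p in path])          # shape (steps, d)
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical toy score standing in for a trained agent: f(x) = tanh(w . x)
w = np.array([0.8, -0.5, 0.3])
f = lambda x: np.tanh(w @ x)
f_grad = lambda x: (1.0 - np.tanh(w @ x) ** 2) * w

x = np.array([1.0, 2.0, -1.0])     # hypothetical feature vector
baseline = np.zeros_like(x)        # assumed all-zeros baseline

attributions = integrated_gradients(f_grad, x, baseline, steps=200)

# Completeness check: attributions should sum to f(x) - f(baseline).
completeness_gap = abs(attributions.sum() - (f(x) - f(baseline)))

# Hypothetical reference feature weights (in the paper these would come
# from the coefficients of the linear model in hindsight).
reference_weights = np.array([0.9, -0.4, 0.2])

# Linear (Pearson) correlation between the two sets of feature weights.
prediction_power = np.corrcoef(attributions, reference_weights)[0, 1]

print("attributions:", attributions)
print("completeness gap:", completeness_gap)
print("correlation with reference weights:", prediction_power)
```

The completeness check is a useful sanity test for any integrated-gradients approximation: if the gap is not close to zero, the number of interpolation steps is probably too small for the function being explained.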
