Deep Reinforcement Learning for Cryptocurrency Trading: Practical Approach to Address Backtest Overfitting

Designing profitable and reliable trading strategies is challenging in the highly volatile cryptocurrency market. Existing works apply deep reinforcement learning (DRL) methods and optimistically report increased profits in backtesting, but these results may be false positives caused by overfitting. In this paper, we propose a practical approach to address backtest overfitting for cryptocurrency trading using DRL. First, we formulate the detection of backtest overfitting as a hypothesis test. Then, we train the DRL agents, estimate the probability of overfitting, and reject the overfitted agents, increasing the chance of good trading performance. Finally, on 10 cryptocurrencies over a testing period from 05/01/2022 to 06/27/2022 (during which the crypto market crashed twice), we show that the less overfitted DRL agents achieve a higher return than more overfitted agents, an equal-weight strategy, and the S&P DBM Index (market benchmark), offering confidence in possible deployment to a real market.
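
The abstract does not say which estimator is used for the probability of overfitting. As a minimal sketch, assuming a combinatorially symmetric cross-validation (CSCV) style estimate in the spirit of Bailey et al., the idea can be illustrated as follows: split the backtest periods into blocks, pick the best agent on half of the blocks, and check how often that agent ranks in the bottom half on the remaining blocks. The function names, the Sharpe-ratio metric, the number of blocks, and the rejection threshold `alpha` are all illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations

import numpy as np


def sharpe(returns: np.ndarray) -> np.ndarray:
    """Per-column Sharpe ratio (not annualized); zero-variance columns get 0."""
    mean = returns.mean(axis=0)
    std = returns.std(axis=0, ddof=1)
    return np.divide(mean, std, out=np.zeros_like(mean), where=std > 0)


def probability_of_backtest_overfitting(returns, n_splits: int = 8) -> float:
    """CSCV-style estimate of the probability of backtest overfitting.

    returns: array of shape (n_periods, n_agents); column j holds the
             per-period returns of agent (or hyperparameter trial) j.
    n_splits: even number of contiguous time blocks (assumed value).
    """
    returns = np.asarray(returns, dtype=float)
    if n_splits % 2 != 0:
        raise ValueError("n_splits must be even")
    n_periods, n_agents = returns.shape
    if n_agents < 2:
        raise ValueError("need at least two agents to rank")

    blocks = np.array_split(np.arange(n_periods), n_splits)
    logits = []
    # Every way of choosing half of the blocks as the in-sample set.
    for in_sample in combinations(range(n_splits), n_splits // 2):
        out_sample = [i for i in range(n_splits) if i not in in_sample]
        is_rows = np.concatenate([blocks[i] for i in in_sample])
        oos_rows = np.concatenate([blocks[i] for i in out_sample])

        best = int(np.argmax(sharpe(returns[is_rows])))
        oos_perf = sharpe(returns[oos_rows])
        # Out-of-sample rank of the in-sample winner: 1 = worst, n_agents = best.
        rank = 1 + int(np.sum(oos_perf < oos_perf[best]))
        omega = rank / (n_agents + 1)
        logits.append(np.log(omega / (1.0 - omega)))

    # Overfitting event: the in-sample winner is at or below the median out of sample.
    return float(np.mean(np.array(logits) <= 0.0))


def is_overfitted(pbo: float, alpha: float = 0.1) -> bool:
    """Reject when the estimated probability reaches alpha (assumed threshold)."""
    return pbo >= alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise_only = rng.normal(0.0, 0.01, size=(400, 5))  # no agent has real skill
    pbo = probability_of_backtest_overfitting(noise_only)
    print(f"estimated PBO: {pbo:.2f}, rejected: {is_overfitted(pbo)}")
```

With returns that are pure noise, the in-sample winner has no reason to stay ahead out of sample, so the estimate tends to be well above zero; when one agent dominates in every block, the estimate drops to zero and the agent would be kept.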
