A reinforcement learning extension to the Almgren-Chriss framework for optimal trade execution

Reinforcement learning is explored as a candidate machine learning technique for enhancing existing analytical solutions for optimal trade execution with elements of market microstructure. Given a volume to trade, a fixed time horizon and discrete trading periods, the aim is to adapt a given volume trajectory so that it responds dynamically to favourable or unfavourable conditions during real-time execution, thereby reducing the overall cost of trading. We consider the standard Almgren-Chriss model with linear price impact as a candidate base model; this model is popular amongst sell-side institutions as a basis for arrival-price benchmark execution algorithms. By training a learning agent to modify the volume trajectory based on the market's prevailing spread and volume dynamics, we are able to improve post-trade implementation shortfall by up to 10.3% on average compared with the base model, based on a sample of stocks and trade sizes in the South African equity market.
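
For context, the base model's schedule in the linear-impact case has a well-known closed form: with per-period variance sigma^2, temporary impact coefficient eta and risk aversion lambda, the optimal holdings decay as x(t) = X * sinh(kappa * (T - t)) / sinh(kappa * T), with kappa approximately sqrt(lambda * sigma^2 / eta) in the small-interval limit. The sketch below computes this static trajectory and then shows one plausible way a spread/volume signal could tilt a single slice while keeping the total volume fixed; the function names, parameter values and the tilt-and-renormalise rule are illustrative assumptions, not the agent trained in the paper.

```python
import numpy as np

def almgren_chriss_trajectory(X, T, N, sigma, eta, lam):
    """Static Almgren-Chriss holdings/trade schedule with linear temporary impact
    (small-interval approximation of the discrete-time solution)."""
    kappa = np.sqrt(lam * sigma**2 / eta)        # decay rate of the risk-averse schedule
    t = np.linspace(0.0, T, N + 1)
    holdings = X * np.sinh(kappa * (T - t)) / np.sinh(kappa * T)
    trades = -np.diff(holdings)                  # shares executed in each of the N periods
    return holdings, trades

def tilt_slice(trades, k, tilt):
    """Illustrative state-dependent adjustment: trade (1 + tilt) times the planned
    volume in period k (tilt > 0 for favourable spread/volume conditions), and
    rescale the remaining periods so the full order still completes by T."""
    adjusted = trades.astype(float).copy()
    delta = tilt * adjusted[k]                   # extra volume pulled forward (or deferred)
    adjusted[k] += delta
    rest = adjusted[k + 1:].sum()
    if rest > 0:
        adjusted[k + 1:] *= (rest - delta) / rest
    return adjusted

if __name__ == "__main__":
    holdings, trades = almgren_chriss_trajectory(
        X=1_000_000, T=1.0, N=10, sigma=0.02, eta=2.5e-7, lam=1e-6)
    print("static schedule:", np.round(trades))
    print("tilted schedule:", np.round(tilt_slice(trades, k=0, tilt=0.25)))
```

In the paper's setting, the tilt for each period would be chosen by the learned policy from the observed spread and volume state rather than supplied by hand as above.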
