A sparse regression and neural network approach for financial factor modeling

Abstract: Factor models are central to understanding risk-return trade-offs in finance. Since Fama and French (1993), hundreds of factors have been reported to have explanatory power for asset pricing. Constructing a factor model involves two tasks: feature selection, which picks a small subset from a large pool of candidate factors to avoid overfitting in regression, and feature engineering, which determines the interactions between the factors. This work examines the process of constructing factor models, not the factors themselves. It proposes a unified, two-step process of dimensionality reduction followed by nonlinear transformation that produces parsimonious, general factor models. Frameworks implementing linear feature selection models are compared with frameworks implementing nonlinear feature reduction techniques. A second stage generalizes the models by learning nonlinear interactions. The framework aims to balance accuracy and interpretability. Computational experiments on historical financial data, using three models of varying degrees of nonlinearity and interpretability, suggest that mixed-integer-programming-based formulations are suitable for linear financial factor selection and that the second-stage nonlinearity introduced by neural networks improves accuracy.
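The first step described in the abstract, selecting a small subset of factors for a linear model, can be illustrated with a minimal sketch. The abstract refers to mixed-integer-programming formulations; the code below does not use a MIP solver. It substitutes exhaustive enumeration, which solves the same cardinality-constrained least-squares problem exactly but is only practical for a handful of candidate factors. The data are synthetic, and all names (`best_subset`, `ols_rss`, `solve_linear_system`, the factor count, the coefficients) are assumptions made for illustration, not taken from the paper. The second, neural-network stage is not shown.

```python
import itertools
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_rss(columns, y):
    """Fit y on the given columns (no intercept); return (coefs, RSS)."""
    # Normal equations: (X^T X) beta = X^T y
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns]
            for ci in columns]
    rhs = [sum(u * t for u, t in zip(ci, y)) for ci in columns]
    beta = solve_linear_system(gram, rhs)
    rss = 0.0
    for i, target in enumerate(y):
        pred = sum(b * col[i] for b, col in zip(beta, columns))
        rss += (target - pred) ** 2
    return beta, rss


def best_subset(factors, y, k):
    """Exhaustive search for the k-factor subset with the lowest RSS."""
    best = None
    for subset in itertools.combinations(range(len(factors)), k):
        beta, rss = ols_rss([factors[j] for j in subset], y)
        if best is None or rss < best[2]:
            best = (subset, beta, rss)
    return best


# Synthetic data (hypothetical): 6 candidate factors, 2 of which drive returns.
rng = random.Random(0)
n_obs, n_factors = 200, 6
factors = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_factors)]
returns = [2.0 * factors[0][t] - 3.0 * factors[3][t] + 0.01 * rng.gauss(0, 1)
           for t in range(n_obs)]

subset, beta, rss = best_subset(factors, returns, k=2)
print("selected factors:", subset)
print("coefficients:", [round(b, 2) for b in beta])
```

Enumeration visits every subset of size k, so its cost grows combinatorially with the number of candidate factors; that growth is the reason a solver-based formulation becomes attractive once the candidate pool reaches the hundreds of factors mentioned in the abstract.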

[1] Campbell R. Harvey, et al. ... and the Cross-Section of Expected Returns, 2014.

[2] E. Fama, et al. A Five-Factor Asset Pricing Model, 2014.

[3] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[4] R. Tibshirani, et al. Rejoinder to "Least Angle Regression" by Efron et al., 2004, math/0406474.

[5] Bryan T. Kelly, et al. Some Characteristics Are Risk Exposures, and the Rest Are Irrelevant, 2017.

[6] Kurt Hornik, et al. Multilayer feedforward networks are universal approximators, 1989, Neural Networks.

[7] Markus Pelger, et al. Deep Learning in Asset Pricing, 2019, Manag. Sci.

[8] Kiyoshi Izumi, et al. Deep Recurrent Factor Model: Interpretable Non-Linear and Time-Varying Multi-Factor Model, 2019, ArXiv.

[9] R. Tibshirani, et al. Extended Comparisons of Best Subset Selection, Forward Stepwise Selection, and the Lasso, 2017, 1707.08692.

[10] Asriel E. Levin, et al. Stock Selection via Nonlinear Multi-Factor Models, 1995, NIPS.

[11] J. Lintner. The Valuation of Risk Assets and the Selection of Risky Investments in Stock Portfolios and Capital Budgets, 1965.

[12] S. Ross. The arbitrage theory of capital asset pricing, 1976.

[13] Characteristics Are Covariances: A Unified Model of Risk and Return, 2018.

[14] R. Wilcox. Introduction to Robust Estimation and Hypothesis Testing, 1997.

[15] Bryan T. Kelly, et al. Empirical Asset Pricing Via Machine Learning, 2018, The Review of Financial Studies.

[16] Thomas Fischer, et al. Deep learning with long short-term memory networks for financial market predictions, 2017, Eur. J. Oper. Res.

[17] Dacheng Xiu, et al. Taming the Factor Zoo: A Test of New Factors, 2017, The Journal of Finance.

[18] Jianjun Xu, et al. Deep Learning with Gated Recurrent Unit Networks for Financial Sequence Predictions, 2018.

[19] Yulei Rao, et al. A deep learning framework for financial time series using stacked autoencoders and long-short term memory, 2017, PloS one.

[20] Daniel Ferreira, et al. Spurious Factors in Linear Asset Pricing Models, 2015.

[21] Mark M. Carhart. On Persistence in Mutual Fund Performance, 1997.

[22] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.

[23] T. Gebbie, et al. Learning low-frequency temporal patterns for quantitative trading, 2020, 2020 IEEE Symposium Series on Computational Intelligence (SSCI).

[24] Matthew F. Dixon, et al. Deep Fundamental Factor Models, 2019, ArXiv.

[25] D. Bertsimas, et al. Best Subset Selection via a Modern Optimization Lens, 2015, 1507.03133.

[26] Cynthia Rudin, et al. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead, 2018, Nature Machine Intelligence.

[27] J. Mossin. Equilibrium in a Capital Asset Market, 1966.

[28] Lei Qi, et al. Sparse High Dimensional Models in Economics, 2011, Annual Review of Economics.

[29] E. Fama, et al. Common risk factors in the returns on stocks and bonds, 1993.

[30] Oussama Lachiheb, et al. A hierarchical Deep neural network design for stock returns prediction, 2018, KES.

[31] W. Sharpe. Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk, 1964.

[32] Trevor Hastie, et al. The Elements of Statistical Learning, 2001.

[33] Campbell R. Harvey, et al. ... and the Cross-Section of Expected Returns, 2016.

[34] Jingyu He, et al. Deep Learning for Predicting Asset Returns, 2018, ArXiv.

[35] John R. M. Hand, et al. The supraview of return predictive signals, 2013.

[36] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.