Industry Return Predictability: A Machine Learning Approach

In this article, the authors use machine learning tools to analyze industry return predictability based on the information in lagged industry returns. Controlling for post-selection inference and multiple testing, they find significant in-sample evidence of industry return predictability. Lagged returns for the financial sector and commodity- and material-producing industries exhibit widespread predictive ability, consistent with the gradual diffusion of information across economically linked industries. Out-of-sample industry return forecasts that incorporate the information in lagged industry returns are economically valuable: Controlling for systematic risk using leading multifactor models from the literature, an industry-rotation portfolio that goes long (short) industries with the highest (lowest) forecasted returns delivers an annualized alpha of over 8%. The industry-rotation portfolio also generates substantial gains during economic downturns, including the Great Recession.

TOPICS: Big data/machine learning, analysis of individual factors/risk premia, portfolio construction, performance measurement
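
The long-short rotation rule mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the equal weighting, the number of industries per leg (`n_legs`), the function names, and all industry labels and numbers are assumptions made purely for illustration, and the forecasts are taken as given rather than estimated.

```python
from typing import Dict


def rotation_weights(forecasts: Dict[str, float], n_legs: int = 2) -> Dict[str, float]:
    """Equal-weight long-short weights from one period's return forecasts.

    Assumption: the long (short) leg holds the n_legs industries with the
    highest (lowest) forecasts; the article's exact sorting rule may differ.
    """
    if n_legs < 1 or 2 * n_legs > len(forecasts):
        raise ValueError("n_legs must be at least 1 and at most half the industries")
    ranked = sorted(forecasts, key=forecasts.get)  # ascending by forecast
    weights = {name: 0.0 for name in forecasts}
    for name in ranked[:n_legs]:       # lowest forecasts: short leg
        weights[name] = -1.0 / n_legs
    for name in ranked[-n_legs:]:      # highest forecasts: long leg
        weights[name] = 1.0 / n_legs
    return weights


def portfolio_return(weights: Dict[str, float], realized: Dict[str, float]) -> float:
    """Realized return of the zero-investment portfolio for one period."""
    return sum(w * realized[name] for name, w in weights.items())


# Hypothetical one-month forecasts and realized returns (made-up numbers).
forecasts = {"Fin": 0.020, "Oil": -0.010, "Tech": 0.015,
             "Util": 0.000, "Mines": -0.020, "Food": 0.005}
realized = {"Fin": 0.030, "Oil": 0.010, "Tech": 0.010,
            "Util": 0.000, "Mines": -0.020, "Food": 0.000}

weights = rotation_weights(forecasts, n_legs=2)
print(weights)
print(portfolio_return(weights, realized))
```

The weights sum to zero, so the portfolio is self-financing. In a backtest, the rule would be reapplied each period using forecasts built only from data available at that time.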
