Issues in Applying Financial Econometrics to Factor-Based Modeling in Investment Management

In this article, the authors provide a nontechnical discussion of several practical and theoretical issues that arise when implementing factor models used to explain or forecast equity returns. The first issue is determining the number of factors (i.e., the number of variables needed to explain or forecast returns). In finite markets such as stock markets, the true number of factors cannot be determined theoretically; asset managers must instead be content with approximations obtained from model selection criteria. The authors then discuss the questions of overfitting and dimensionality reduction, both of which can lead to poor out-of-sample performance of investment or trading strategies. Overfitting means using a model that is too complex for the data available to the modeler, so that the resulting model fits noise. Dimensionality reduction addresses the problem of high dimensionality by using approximate models of reduced dimensionality that can be estimated with small samples. An important application of dimensionality reduction techniques is the use of factor GARCH models to forecast covariance matrices. Finally, the authors discuss problems associated with backtesting. In trying to choose the best-performing model or strategy, a modeler may be tempted to run multiple backtests, thereby creating the risk of using out-of-sample backtesting as a form of in-sample testing, which in turn leads to overfitting.
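
The backtesting problem described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' method: it assumes a set of hypothetical strategies whose returns are pure Gaussian noise with zero mean, so none of them has any real skill. The function name `simulate_selection_bias`, the parameter values, and the half-and-half split into a backtest window and a held-out window are all assumptions made for illustration.

```python
import random
import statistics


def sharpe(returns):
    """Per-period Sharpe ratio: mean return over sample standard deviation."""
    return statistics.mean(returns) / statistics.stdev(returns)


def simulate_selection_bias(n_strategies=200, n_periods=250, seed=0):
    """Backtest many skill-less strategies and pick the backtest winner.

    Every strategy's returns are i.i.d. zero-mean Gaussian noise, so the
    true Sharpe ratio of each one is zero (an illustrative assumption).
    """
    rng = random.Random(seed)
    half = n_periods // 2
    results = []
    for _ in range(n_strategies):
        r = [rng.gauss(0.0, 0.01) for _ in range(n_periods)]
        # First half is the "backtest" window, second half is held out.
        results.append((sharpe(r[:half]), sharpe(r[half:])))
    # Choosing the strategy with the best backtest is the selection step.
    best_backtest, best_holdout = max(results, key=lambda pair: pair[0])
    mean_backtest = statistics.mean(s for s, _ in results)
    return best_backtest, best_holdout, mean_backtest


if __name__ == "__main__":
    best, holdout, average = simulate_selection_bias()
    print(f"average backtest Sharpe:         {average:+.3f}")
    print(f"best backtest Sharpe:            {best:+.3f}")
    print(f"same strategy on held-out data:  {holdout:+.3f}")
```

Because the winner is chosen after looking at all the backtests, its backtest Sharpe ratio is biased upward even though every strategy is noise. The more backtests are run, the larger this bias becomes, which is the sense in which repeated backtesting turns an out-of-sample test into an in-sample one.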
