High-dimensionality effects in the Markowitz problem and other quadratic programs with linear equality constraints: risk underestimation

We study the properties of solutions of quadratic programs with linear equality constraints whose parameters are estimated from data in the high-dimensional setting where p, the number of variables in the problem, is of the same order of magnitude as n, the number of observations used to estimate the parameters. The Markowitz problem in finance is a special case of our study. Assuming normality and independence of the observations, we relate the efficient frontier computed empirically to the “true” efficient frontier. Our computations show that there is a separation of the errors induced by estimating the mean of the observations and estimating the covariance matrix. In particular, the price paid for estimating the covariance matrix is an underestimation of the variance by a factor roughly equal to 1 − p/n. Therefore the risk of the optimal population solution is underestimated when we estimate it by solving a similar quadratic program with estimated parameters. We also characterize the statistical behavior of linear functionals of the empirical optimal vector and show that they are biased estimators of the corresponding population quantities. We investigate the robustness of our Gaussian results by extending the study to certain elliptical models and models where our n observations are correlated (in “time”). We show a lack of robustness of the Gaussian results, but are still able to get results concerning first-order properties of the quantities of interest, even in the case of relatively heavy-tailed data (we require two moments). Risk underestimation is still present in the elliptical case and is more pronounced than in the Gaussian case. We discuss properties of the non-parametric and parametric bootstrap in this context. We show several results, including the interesting fact that standard applications of the bootstrap generally yield inconsistent estimates of bias.
Finally, we propose some strategies to correct these problems and validate them in simulations. Throughout the paper, we assume that p, n and n − p tend to infinity, and that p < n.
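
The 1 − p/n underestimation factor mentioned above can be checked numerically in a simple special case. The following is a minimal sketch, not the paper's own experiment: it assumes i.i.d. Gaussian observations with identity covariance and a single constraint (weights summing to one, i.e. the global minimum-variance portfolio). The function name `min_variance_ratio` and the values n = 400, p = 100 are illustrative choices, not taken from the paper.

```python
import numpy as np

def min_variance_ratio(n, p, n_rep=200, seed=0):
    """Average ratio of the estimated to the true minimum variance.

    Assumption (illustrative): observations are i.i.d. N(0, I_p), and the
    only constraint is that the weights sum to one.
    """
    rng = np.random.default_rng(seed)
    ones = np.ones(p)
    # Population optimum: 1 / (1' Sigma^{-1} 1) with Sigma = I_p.
    true_var = 1.0 / p
    ratios = []
    for _ in range(n_rep):
        X = rng.standard_normal((n, p))
        S = np.cov(X, rowvar=False)  # sample covariance matrix (p x p)
        # Plug-in optimum: same formula with Sigma replaced by S.
        est_var = 1.0 / (ones @ np.linalg.solve(S, ones))
        ratios.append(est_var / true_var)
    return float(np.mean(ratios))

n, p = 400, 100
ratio = min_variance_ratio(n, p)
print(f"average estimated/true variance: {ratio:.3f}")
print(f"1 - p/n:                         {1 - p / n:.3f}")
```

In this setting the plug-in variance should come out close to 1 − p/n times the population value, so the naive estimate understates the risk of the optimal portfolio. The sketch only covers the Gaussian case; it does not reproduce the elliptical or correlated-observation results.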
