Large Dimensional Covariance Matrix Estimation Via a Factor Model

High dimensionality comparable to sample size is common in many statistical problems. We examine covariance matrix estimation in an asymptotic framework in which the dimensionality p tends to infinity as the sample size n increases. Motivated by the Arbitrage Pricing Theory in finance, a multi-factor model is employed to reduce dimensionality and to estimate the covariance matrix. The factors are observable and the number of factors K is allowed to grow with p. We investigate the impact of p and K on the performance of the model-based covariance matrix estimator. Under mild assumptions, we establish convergence rates and asymptotic normality of the model-based estimator. Its performance is compared with that of the sample covariance matrix. We identify situations in which the factor approach improves performance substantially and situations in which it improves performance only marginally. The impact of covariance matrix estimation on portfolio allocation and risk management is also studied. The asymptotic results are supported by a thorough simulation study.
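
As a rough illustration of the kind of estimator discussed above, the following Python sketch builds a covariance estimate from a multi-factor model with observable factors and compares it with the sample covariance matrix when p exceeds n. It is a minimal sketch, not necessarily the exact estimator analyzed in the paper: the function name `factor_covariance`, the simulation settings, and in particular the assumption that the idiosyncratic errors are uncorrelated across variables (a diagonal error covariance) are illustrative choices that the abstract does not state.

```python
import numpy as np

def factor_covariance(Y, F):
    """Factor-model covariance estimate (illustrative sketch).

    Y : (n, p) array of observations, F : (n, K) array of observed factors.
    Assumes Y = a + F B' + error, with errors uncorrelated across the p
    variables (an assumption made here for illustration).
    """
    n, p = Y.shape
    K = F.shape[1]
    # Least-squares regression of every variable on the factors plus an intercept.
    X = np.column_stack([np.ones(n), F])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)      # shape (K + 1, p)
    B = coef[1:].T                                     # (p, K) estimated loadings
    resid = Y - X @ coef
    cov_f = np.atleast_2d(np.cov(F, rowvar=False))     # (K, K) factor covariance
    # Diagonal matrix of residual variances, with a degrees-of-freedom correction.
    sigma0 = np.diag(resid.var(axis=0, ddof=K + 1))
    return B @ cov_f @ B.T + sigma0

# Hypothetical simulation: p = 100 variables, n = 60 observations, K = 3 factors.
rng = np.random.default_rng(0)
n, p, K = 60, 100, 3
F = rng.normal(size=(n, K))
B_true = rng.normal(size=(p, K))
Y = F @ B_true.T + rng.normal(size=(n, p))

sigma_hat = factor_covariance(Y, F)
sample_cov = np.cov(Y, rowvar=False)

print("rank of sample covariance:", np.linalg.matrix_rank(sample_cov))
print("smallest eigenvalue of factor estimate:", np.linalg.eigvalsh(sigma_hat).min())
```

With p larger than n the sample covariance matrix is singular, whereas the factor-based estimate in this sketch stays positive definite because a diagonal matrix of positive residual variances is added to a positive semidefinite term. This matters for portfolio allocation, which typically requires inverting the covariance matrix.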