Using Principal Component Analysis to Estimate a High Dimensional Factor Model with High-Frequency Data

This paper constructs an estimator for the number of common factors in a setting where both the sampling frequency and the number of variables increase. Empirically, we document that the covariance matrix of a large portfolio of US equities is well represented by a low-rank common structure with a sparse residual matrix. When employed for out-of-sample portfolio allocation, the proposed estimator substantially outperforms the sample covariance estimator.
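To make the "low-rank common structure with a sparse residual matrix" concrete, the following is a minimal sketch of a generic PCA-plus-thresholding decomposition. It is not the paper's estimator: the function name `low_rank_plus_sparse_cov`, the hard-thresholding rule, and the `threshold` parameter are illustrative assumptions, and the number of factors is taken as an input rather than estimated.

```python
import numpy as np


def low_rank_plus_sparse_cov(returns, n_factors, threshold):
    """Split a realized covariance matrix into low-rank and sparse parts.

    Hypothetical helper for illustration only.
    returns   : (n_obs, n_assets) array of high-frequency return increments
    n_factors : number of principal components kept (assumed known here)
    threshold : off-diagonal residual entries below this magnitude are zeroed
    """
    returns = np.asarray(returns, dtype=float)

    # Realized covariance: sum of outer products of the return increments.
    cov = returns.T @ returns

    # eigh returns eigenvalues in ascending order, so sort them descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Low-rank part: the leading n_factors eigenpairs.
    top_vecs = eigvecs[:, :n_factors]
    low_rank = (top_vecs * eigvals[:n_factors]) @ top_vecs.T

    # Residual part, symmetrized to remove floating-point asymmetry.
    residual = cov - low_rank
    residual = (residual + residual.T) / 2.0

    # Assumed rule: hard-threshold off-diagonal entries, keep the diagonal.
    sparse = np.where(np.abs(residual) >= threshold, residual, 0.0)
    np.fill_diagonal(sparse, np.diag(residual))

    return low_rank, sparse
```

The sum `low_rank + sparse` would then serve as the regularized covariance estimate in place of the raw sample covariance. In practice the threshold and the number of factors would be chosen by a data-driven rule rather than fixed by hand.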
