Estimation of high-dimensional factor models and its application in power data analysis

Factor models are widely used with high-dimensional data to reduce dimension and extract relevant information. The spectrum of covariance matrices from power data has two components: 1) a bulk, which arises from random noise or fluctuations, and 2) spikes, which represent factors caused by anomaly events. In this paper, we propose a new approach to estimating high-dimensional factor models. The residuals of the power data are obtained by subtracting principal components, and the approach minimizes the distance between the empirical spectral density (ESD) of the covariance matrices of these residuals and the limiting spectral density (LSD) of a multiplicative covariance structure model. Free probability techniques from random matrix theory (RMT) are used to calculate the spectral density of the multiplicative covariance model, which resolves the computational difficulties efficiently. The proposed approach connects the estimation of the number of factors to the LSD of the covariance matrices of the residuals, and thereby provides estimators of both the number of factors and the correlation structure of the residuals. Because power data contain substantial measurement noise and their residuals have a complex correlation structure, the approach fits the ESD of the residual covariance matrices with a multiplicative covariance model, which avoids crude assumptions or simplifications about the structure of the data. Theoretical studies show that the proposed approach is robust to noise and sensitive to the presence of weak factors. Synthetic data from the IEEE 118-bus power system are used to validate the effectiveness of the approach. Furthermore, an application to real-world online monitoring data from a power grid shows that the estimators can be used to indicate the system state.
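The idea of matching the residual spectrum to a limiting law can be illustrated with a minimal Python sketch. It is not the method of the paper: for simplicity it replaces the multiplicative covariance model and its free-probability LSD with the plain Marchenko-Pastur law (white-noise residuals), and it uses a Kolmogorov-type distance between cumulative distributions. The function names `mp_cdf`, `spectral_distance` and `estimate_num_factors`, and all parameter choices, are assumptions made for illustration only.

```python
import numpy as np

def mp_cdf(x, c, sigma2=1.0, n_grid=2000):
    """CDF of the Marchenko-Pastur law with ratio c = N/T < 1 and variance sigma2,
    obtained by numerically integrating the density on its support [a, b]."""
    a = sigma2 * (1.0 - np.sqrt(c)) ** 2
    b = sigma2 * (1.0 + np.sqrt(c)) ** 2
    grid = np.linspace(a, b, n_grid)
    dens = np.sqrt(np.maximum((b - grid) * (grid - a), 0.0)) / (2.0 * np.pi * c * sigma2 * grid)
    # Trapezoidal cumulative integral, renormalized to absorb discretization error.
    cdf = np.concatenate([[0.0], np.cumsum(0.5 * (dens[1:] + dens[:-1]) * np.diff(grid))])
    cdf /= cdf[-1]
    return np.interp(x, grid, cdf, left=0.0, right=1.0)

def spectral_distance(X, k):
    """Distance between the residual ESD (after removing k principal components)
    and a Marchenko-Pastur law whose variance is matched to the residuals.
    X has shape (N, T): N variables observed at T time points."""
    N, T = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)
    s = np.linalg.svd(Xc, compute_uv=False)
    # Eigenvalues of the residual sample covariance: drop the k largest components.
    lam = np.sort(s[k:] ** 2 / T)
    sigma2 = lam.mean()  # assumption: residual variance set to the mean eigenvalue
    ecdf = np.arange(1, lam.size + 1) / lam.size
    return float(np.max(np.abs(ecdf - mp_cdf(lam, N / T, sigma2))))

def estimate_num_factors(X, k_max):
    """Return the k in 0..k_max minimizing the spectral distance, plus all distances."""
    d = [spectral_distance(X, k) for k in range(k_max + 1)]
    return int(np.argmin(d)), d

# Toy data: 3 strong factors plus white noise (N = 100 variables, T = 400 samples).
rng = np.random.default_rng(0)
N, T, k_true = 100, 400, 3
loadings = rng.standard_normal((N, k_true))
factors = rng.standard_normal((k_true, T))
X = loadings @ factors + rng.standard_normal((N, T))
k_hat, distances = estimate_num_factors(X, k_max=8)
```

In this toy setting the distance at `k = 0` is dominated by the three spikes, which inflate the matched variance and push the bulk far from the fitted law; removing the factors brings the residual spectrum close to it. With correlated residuals, as in power data, the white-noise law used here would be a poor fit, which is the motivation for the richer multiplicative covariance model in the text.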
