Exploratory factor analysis - Parameter estimation and scores prediction with high-dimensional data

Aiming at high-dimensional situations, we first introduce a distribution-free approach to parameter estimation in the standard random factor model and show that it leads to the same estimating equations as maximum likelihood estimation under normality. The derivation is considerably simpler and works equally well when there are more variables than observations (p > n). We then concentrate on the latter case and show results of the following type: (i) although factor loadings and specific variances cannot be estimated precisely unless n is large, precise factor scores do not require n to be large, only p; (ii) a classical fixed point iteration method can be expected to converge safely and rapidly, provided p is large. A microarray data set, with p = 2000 and n = 22, is used to illustrate these theoretical results.

We treat the case of more variables than observations (p > n) in the standard FA model. For large p, factor scores can be estimated with high precision (n need not be large). For large p, an old iteration method converges fast and with no inadmissible values.
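To make the p > n setting concrete, the following is a minimal sketch of one classical-style fixed point scheme for the factor model, followed by regression-type factor scores. It is an illustration under stated assumptions, not necessarily the iteration analysed in the paper: the function names (`simulate`, `fit_fa`, `regression_scores`), the starting value, the small positive floor on the specific variances, and the simulated data are all hypothetical choices made here. The update alternates between loadings taken from the leading eigenvectors of the scaled covariance and specific variances taken from the diagonal of the residual covariance, working on the n x p data matrix so that no p x p matrix is ever formed.

```python
import numpy as np


def simulate(n, p, k, rng):
    """Simulated data from x = Lambda f + e with diagonal noise (illustrative only)."""
    loadings = rng.normal(size=(p, k))
    psi = rng.uniform(0.5, 1.5, size=p)
    scores = rng.normal(size=(n, k))
    x = scores @ loadings.T + rng.normal(size=(n, p)) * np.sqrt(psi)
    return x, scores


def fit_fa(x, k, n_iter=200, tol=1e-8):
    """Fixed point iteration for loadings (p x k) and specific variances (p,)."""
    n, p = x.shape
    xc = x - x.mean(axis=0)
    s_diag = (xc ** 2).sum(axis=0) / n      # diagonal of the sample covariance S
    psi = 0.5 * s_diag                      # assumed starting value
    lam = np.zeros((p, k))
    for _ in range(n_iter):
        # Singular values of the scaled n x p data give the nonzero
        # eigenvalues of Psi^{-1/2} S Psi^{-1/2} without forming S.
        z = xc / np.sqrt(psi) / np.sqrt(n)
        _, d, vt = np.linalg.svd(z, full_matrices=False)
        theta = d[:k] ** 2
        v = vt[:k].T
        lam = np.sqrt(psi)[:, None] * v * np.sqrt(np.maximum(theta - 1.0, 0.0))
        # Specific variances from diag(S - Lambda Lambda'), floored as a guard.
        psi_new = np.maximum(s_diag - (lam ** 2).sum(axis=1), 1e-6)
        converged = np.max(np.abs(psi_new - psi)) < tol
        psi = psi_new
        if converged:
            break
    return lam, psi


def regression_scores(x, lam, psi):
    """Regression predictor (I + L' Psi^{-1} L)^{-1} L' Psi^{-1} (x - mean)."""
    xc = x - x.mean(axis=0)
    w = lam / psi[:, None]
    m = np.eye(lam.shape[1]) + lam.T @ w
    return np.linalg.solve(m, w.T @ xc.T).T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, f_true = simulate(n=22, p=2000, k=2, rng=rng)  # dimensions mirror the abstract
    lam, psi = fit_fa(x, k=2)
    f_hat = regression_scores(x, lam, psi)
    print(f_hat.shape)
```

Because loadings and scores are identified only up to rotation, a fair comparison of `f_hat` with the simulated scores looks at the spanned subspaces (for example via canonical correlations) rather than at individual columns.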
