Cross-Validatory Estimation of the Number of Components in Factor and Principal Components Models

By means of factor analysis (FA) or principal components analysis (PCA), a matrix Y with the elements $y_{ik}$ is approximated by the model $y_{ik} = \alpha_i + \sum_{a=1}^{A} \beta_{ia}\theta_{ak} + \epsilon_{ik}$ (I). Here the parameters α, β and θ express the systematic part of the data $y_{ik}$, the "signal," and the residuals $\epsilon_{ik}$ express the "random" part, the "noise." When FA or PCA is applied to a matrix of real data obtained, for example, by characterizing N chemical mixtures by M measured variables, one major problem is the estimation of the rank A of the matrix Y, i.e. the estimation of how much of the data $y_{ik}$ is "signal" and how much is "noise." Cross validation can be used to approach this problem. The matrix Y is partitioned, and the rank A is chosen to maximize the predictive properties of model (I) when the parameters are estimated on one part of the matrix Y and the predictions are tested on another part of Y.
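
The partition-and-predict idea described above can be illustrated with a minimal NumPy sketch. It is not the algorithm of the paper: it assumes a diagonal hold-out pattern over the elements of Y and fills the held-out elements by iterated truncated SVD, which is a swapped-in technique chosen for brevity. The names `press_for_rank`, `choose_rank`, `n_groups` and `n_iter` are hypothetical.

```python
import numpy as np

def press_for_rank(Y, A, n_groups=7, n_iter=100):
    """Prediction error sum of squares (PRESS) of a rank-A model.

    Assumption: elements are held out in a diagonal pattern and predicted
    by iterated truncated SVD; n_groups should not divide the number of
    columns, otherwise whole columns would be held out at once.
    """
    Y = np.asarray(Y, dtype=float)
    N, M = Y.shape
    # Element (i, k) goes to group (i*M + k) mod n_groups.
    groups = np.arange(N * M).reshape(N, M) % n_groups
    press = 0.0
    for g in range(n_groups):
        held = groups == g
        X = Y.copy()
        # Start the held-out cells at the mean of the observed cells per column.
        col_means = np.nanmean(np.where(held, np.nan, Y), axis=0)
        X[held] = col_means[np.nonzero(held)[1]]
        for _ in range(n_iter):
            mu = X.mean(axis=0)  # column means (the alpha term)
            U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
            fit = mu + (U[:, :A] * s[:A]) @ Vt[:A]  # rank-A systematic part
            X[held] = fit[held]  # update only the held-out cells
        press += np.sum((Y[held] - X[held]) ** 2)
    return press

def choose_rank(Y, max_rank, **kwargs):
    """Return the rank in 0..max_rank with the smallest PRESS, and all PRESS values."""
    scores = [press_for_rank(Y, A, **kwargs) for A in range(max_rank + 1)]
    return int(np.argmin(scores)), scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data: two strong components plus small noise (illustration only).
    T = rng.normal(size=(40, 2)) * np.array([5.0, 3.0])
    P = rng.normal(size=(2, 8))
    Y = T @ P + 0.1 * rng.normal(size=(40, 8))
    best, scores = choose_rank(Y, max_rank=4)
    print(best, [round(v, 2) for v in scores])
```

Minimizing PRESS is one simple selection rule. Other criteria, such as comparing the PRESS of successive ranks, could be substituted in `choose_rank` without changing the hold-out loop.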
