Variable selection in multivariate methods using global score estimation

We propose a variable selection method based on global score estimation. It can serve as a selection criterion in any multivariate method without external variables, such as principal component analysis, factor analysis, and correspondence analysis. The method selects the subset of variables whose global scores (e.g. principal component scores, factor scores, or individual scores), computed from the selected variables alone, approximate the original global scores as closely as possible in the least-squares sense. Because global scores are usually mutually orthogonal, the estimated scores are restricted to be mutually orthogonal as well. We propose three computational steps for estimating the scores, which differ in how they satisfy that restriction. An example data set is analyzed to demonstrate the performance and usefulness of the method: the proposed algorithm is evaluated, and the results of four cost-saving selection procedures are compared. The example shows that combining these steps and procedures yields more accurate results quickly.
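
The least-squares criterion described above can be illustrated with a minimal sketch for the principal component case. It is not the authors' algorithm: the function names (`pc_scores`, `estimate_scores`, `orthogonalize`, `loss`, `backward_select`), the regression-based score estimate, the SVD-based orthogonalization, and the backward elimination loop are all assumptions chosen for illustration. The abstract does not specify its three computational steps or its four cost-saving procedures.

```python
import numpy as np


def center(X):
    """Subtract the column means."""
    return X - X.mean(axis=0)


def pc_scores(X, r):
    """Original global scores: first r principal component scores of X."""
    U, s, _ = np.linalg.svd(center(X), full_matrices=False)
    return U[:, :r] * s[:r]


def estimate_scores(X, subset, Z):
    """Least-squares estimate of Z from the selected variables only
    (assumed estimator: projection of Z onto the span of X[:, subset])."""
    Xs = center(X[:, subset])
    coef, *_ = np.linalg.lstsq(Xs, Z, rcond=None)
    return Xs @ coef


def orthogonalize(Z_hat, Z):
    """One hypothetical way to impose mutual orthogonality: take the closest
    matrix with orthonormal columns (via SVD), then rescale each column to
    the norm of the corresponding original score."""
    U, _, Vt = np.linalg.svd(Z_hat, full_matrices=False)
    return (U @ Vt) * np.linalg.norm(Z, axis=0)


def loss(X, subset, Z):
    """Squared Frobenius distance between original and estimated scores."""
    return float(np.sum((Z - estimate_scores(X, subset, Z)) ** 2))


def backward_select(X, r, q):
    """Backward elimination (an assumed procedure): repeatedly drop the
    variable whose removal increases the loss least, until q remain."""
    Z = pc_scores(X, r)
    subset = list(range(X.shape[1]))
    while len(subset) > q:
        candidates = [[v for v in subset if v != drop] for drop in subset]
        subset = min(candidates, key=lambda s: loss(X, s, Z))
    return subset, Z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 6))          # synthetic data, for illustration only
    subset, Z = backward_select(X, r=2, q=3)
    Z_orth = orthogonalize(estimate_scores(X, subset, Z), Z)
    print("selected variables:", subset)
    print("loss:", loss(X, subset, Z))
    print("Gram matrix of orthogonalized scores:\n", Z_orth.T @ Z_orth)
```

In this sketch the loss is zero when all variables are retained, since the principal component scores are linear combinations of the centered variables, and it can only grow as variables are removed. The orthogonalization step is applied after estimation here; other ways of meeting the restriction are possible.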
