Stability of nonlinear principal components analysis: an empirical study using the balanced bootstrap.

Principal components analysis (PCA) is used to explore the structure of data sets containing linearly related numeric variables. Alternatively, nonlinear PCA can handle possibly nonlinearly related numeric as well as nonnumeric variables. For linear PCA, the stability of its solution can be established under the assumption of multivariate normality. For nonlinear PCA, however, standard options for establishing stability are not provided. The authors use the nonparametric bootstrap procedure to assess the stability of nonlinear PCA results, applied to empirical data. They use confidence intervals for the variable transformations and confidence ellipses for the eigenvalues, the component loadings, and the person scores. They discuss the balanced version of the bootstrap, bias estimation, and Procrustes rotation. To provide a benchmark, the same bootstrap procedure is applied to linear PCA on the same data. On the basis of the results, the authors advise using at least 1,000 bootstrap samples, using Procrustes rotation on the bootstrap results, examining the bootstrap distributions along with the confidence regions, and merging categories with small marginal frequencies to reduce the variance of the bootstrap results.
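The resampling steps named in the abstract (balanced bootstrap, Procrustes rotation of each bootstrap solution toward the original solution, and confidence regions from the rotated results) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It uses ordinary linear PCA on a correlation matrix as a stand-in for nonlinear PCA, and it reports simple percentile intervals for the loadings rather than the confidence ellipses the study uses. All function names and the choice of NumPy are assumptions made for this example.

```python
import numpy as np

def balanced_bootstrap_indices(n, n_boot, rng):
    """Balanced bootstrap: every observation appears exactly n_boot times
    across all samples. Build n_boot copies of 0..n-1, shuffle, and cut
    the pool into n_boot samples of size n."""
    pool = np.tile(np.arange(n), n_boot)
    rng.shuffle(pool)
    return pool.reshape(n_boot, n)

def pca_loadings(X, k):
    """Linear PCA on the correlation matrix (stand-in for nonlinear PCA).
    Returns the k largest eigenvalues and the p x k component loadings."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)        # ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    return eigvals[order], eigvecs[:, order] * np.sqrt(eigvals[order])

def procrustes_rotate(A, target):
    """Orthogonal Procrustes: find orthogonal T minimizing ||A T - target||_F
    and return A T. This removes arbitrary reflections and rotations of
    the bootstrap solution relative to the original solution."""
    U, _, Vt = np.linalg.svd(A.T @ target)
    return A @ (U @ Vt)

def bootstrap_loadings(X, k=2, n_boot=1000, seed=0):
    """Bootstrap the loadings and return 95% percentile bounds (assumed
    interval type; the study itself works with confidence ellipses)."""
    rng = np.random.default_rng(seed)
    _, target = pca_loadings(X, k)
    idx = balanced_bootstrap_indices(X.shape[0], n_boot, rng)
    boot = np.stack([
        procrustes_rotate(pca_loadings(X[i], k)[1], target) for i in idx
    ])
    lower, upper = np.percentile(boot, [2.5, 97.5], axis=0)
    return target, boot, lower, upper
```

Calling `bootstrap_loadings(X, k=2, n_boot=1000)` on an n x p numeric array returns the original loadings, the full array of rotated bootstrap loadings (useful for inspecting the bootstrap distributions directly, as the abstract recommends), and elementwise lower and upper bounds.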
