Application of Resampling Methods to the Choice of Dimension in Principal Component Analysis

This paper investigates the choice of dimension in Principal Component Analysis (PCA). PCA is introduced as a model, and a loss function assessing the stability of the fit is considered. The choice of dimension then amounts to minimising an expected loss, which has to be estimated. This is achieved by resampling methods: several bootstrap and jackknife estimates are presented. The behaviour of these estimates is investigated on artificial data and on real data. The resulting choices are compared with those given by naive rules.
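
As an illustration of the general idea, the following minimal sketch estimates a stability loss for each candidate dimension q by bootstrap resampling. It is not the paper's own estimator: the loss used here (half the squared Frobenius distance between the projector fitted on the full sample and the projector fitted on a resample), the function names `top_q_projector` and `bootstrap_stability_loss`, and the parameters `n_boot` and `seed` are all assumptions made for the example.

```python
import numpy as np


def top_q_projector(X, q):
    """Orthogonal projector onto the span of the first q principal axes of X."""
    Xc = X - X.mean(axis=0)  # centre the columns
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:q].T
    return V @ V.T


def bootstrap_stability_loss(X, q, n_boot=200, seed=0):
    """Bootstrap estimate of an assumed stability loss for dimension q.

    For two rank-q orthogonal projectors P and P*,
    0.5 * ||P - P*||_F^2 = q - trace(P P*),
    which is the quantity averaged over the resamples below.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    P_hat = top_q_projector(X, q)
    losses = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # draw n rows with replacement
        P_star = top_q_projector(X[idx], q)
        losses.append(q - np.trace(P_hat @ P_star))
    return float(np.mean(losses))


if __name__ == "__main__":
    # Artificial data (assumed for the example): two dominant directions
    # followed by three directions of equal, smaller variance.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 1.0, 1.0])
    for q in range(1, 5):
        print(q, bootstrap_stability_loss(X, q))
```

Under these assumptions, a dimension that cuts through a group of nearly equal eigenvalues should show a markedly larger estimated loss than one that separates well-spaced eigenvalues, so a natural choice would be a value of q at which the estimate is small. A jackknife variant could presumably be obtained by replacing the resampling step with leave-one-out subsamples.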