Distortion function and clustering for local linear models

Principal component analysis (PCA) is a ubiquitous statistical technique for data analysis. However, PCA is limited by its linearity and may be too simple for real-world data, especially when the relations among variables are nonlinear. Recent years have witnessed the emergence of nonlinear generalizations of PCA, for instance nonlinear principal component analysis (NLPCA) [1] and vector quantization principal component analysis (VQPCA) [2]. VQPCA is a two-step procedure: the data space is first clustered into several regions, and PCA is then applied within each local region. In Ref. [3], VQPCA was applied to the reconstruction of dynamical response and was shown to be potentially more effective than conventional PCA. The purpose of this technical note is to investigate VQPCA further and to take a closer look at the choice of the distortion function used for clustering the data space.
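
To make the two-step procedure concrete, the following is a minimal sketch in Python with NumPy, not the implementation of Refs. [2] or [3]. It assumes that the clustering step is a plain k-means loop with squared Euclidean distance standing in for the distortion function, which is only one possible choice and is precisely the choice this note examines. The function names `local_pca_sketch` and `reconstruct`, and the parameters `n_regions`, `n_components`, `n_iter` and `seed`, are hypothetical and introduced here for illustration only.

```python
import numpy as np


def local_pca_sketch(X, n_regions, n_components, n_iter=50, seed=0):
    """Cluster X into regions, then fit one PCA per region (illustrative only)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the region centres on randomly chosen data points.
    centers = X[rng.choice(len(X), size=n_regions, replace=False)].copy()

    def assign(c):
        # ASSUMPTION: squared Euclidean distance is used as the distortion.
        d2 = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    # Step 1: clustering of the data space (plain k-means iterations).
    for _ in range(n_iter):
        labels = assign(centers)
        for k in range(n_regions):
            members = X[labels == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    labels = assign(centers)

    # Step 2: PCA within each region, via the local covariance matrix.
    means = centers.copy()
    bases = []
    for k in range(n_regions):
        members = X[labels == k]
        if len(members) > 0:
            means[k] = members.mean(axis=0)
        centred = members - means[k]
        cov = centred.T @ centred / max(len(members) - 1, 1)
        _, vecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
        # Keep the leading eigenvectors as columns of the local basis.
        bases.append(vecs[:, ::-1][:, :n_components])
    return labels, means, bases


def reconstruct(X, labels, means, bases):
    """Project each sample onto the local principal subspace of its region."""
    X = np.asarray(X, dtype=float)
    X_hat = np.empty_like(X)
    for k, (mu, U) in enumerate(zip(means, bases)):
        mask = labels == k
        X_hat[mask] = mu + (X[mask] - mu) @ U @ U.T
    return X_hat
```

Calling `local_pca_sketch(X, n_regions=1, n_components=q)` reduces to conventional PCA with `q` components, which gives a simple baseline for comparing reconstruction errors. Replacing the distance inside `assign` is the natural place to experiment with other distortion functions.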