Greedy kernel PCA for training data reduction and nonlinear feature extraction in classification

This paper applies greedy kernel principal component analysis (greedy kernel PCA) to training data reduction and nonlinear feature extraction for classification. Kernel PCA is a nonlinear extension of linear PCA and provides a powerful nonlinear feature extraction technique via the kernel trick. A disadvantage of kernel PCA, however, is that storing the training data in terms of dot products is expensive, since the size of the kernel matrix grows quadratically with the number of training samples. Greedy kernel PCA is therefore proposed as a more efficient method that combines training data reduction with nonlinear feature extraction for classification. Its reduced set method seeks a new kernel expansion that well approximates the original training data. Simulation results show that both kernel PCA and greedy kernel PCA are superior to linear PCA in feature extraction. Greedy kernel PCA tends towards kernel PCA feature extraction as a larger percentage of the training data is included in the reduced set, while incurring a lower evaluation cost thanks to the reduced training set. The experiments also show that greedy kernel PCA can significantly reduce complexity while retaining classification accuracy.
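
As a rough illustration of the reduced set idea (a sketch, not the authors' implementation), the following Python fragment greedily selects m training points whose feature-space images best span the remaining data; kernel PCA can then be fitted on the m x m kernel matrix of the selected points instead of the full n x n matrix. The RBF kernel, the reconstruction-error criterion, and all function names here are assumptions made for illustration.

    import numpy as np

    def rbf_kernel(X, Y, gamma=1.0):
        # Gaussian (RBF) kernel matrix between the rows of X and the rows of Y.
        d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * d2)

    def greedy_select(X, m, gamma=1.0):
        # Greedily pick m points: at each step, add the point with the largest
        # feature-space reconstruction error w.r.t. the span of those already
        # chosen. For the RBF kernel k(x, x) = 1, so the error of x_i is
        #   1 - k_S(x_i)^T K_SS^{-1} k_S(x_i).
        n = X.shape[0]
        selected = [0]                      # seed with an arbitrary point
        for _ in range(m - 1):
            S = X[selected]
            K_ss = rbf_kernel(S, S, gamma)  # small |S| x |S| matrix
            K_xs = rbf_kernel(X, S, gamma)  # n x |S|; the full n x n matrix is never formed
            coeff = np.linalg.solve(K_ss + 1e-10 * np.eye(len(selected)), K_xs.T)
            err = 1.0 - np.sum(K_xs.T * coeff, axis=0)
            err[selected] = -np.inf         # never re-pick a chosen point
            selected.append(int(np.argmax(err)))
        return np.array(selected)

    # Usage: reduce 1000 training points to a 50-point reduced set, then fit
    # kernel PCA (e.g. sklearn.decomposition.KernelPCA) on X[idx] only.
    X = np.random.randn(1000, 10)
    idx = greedy_select(X, m=50, gamma=0.5)

Because only the n x m and m x m kernel matrices are ever formed, storage grows linearly rather than quadratically in the number of training samples, which is the source of the lower evaluation cost mentioned above.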