A pattern selection algorithm in kernel PCA applications

Principal component analysis (PCA) has been used extensively in many fields, including earth science, for identifying spatial patterns. However, standard PCA is intrinsically linear, which prevents it from detecting nonlinear structures. Kernel-based principal component analysis (KPCA), a recently emerging technique, provides a new approach for exploring and identifying nonlinear patterns in scientific data. In this paper, we recast KPCA in the PCA notation commonly used in earth science communities and demonstrate how to apply the technique to the analysis of earth science data sets. In such applications, a large number of principal components should be retained for studying the spatial patterns, yet the variance they explain cannot be quantitatively transferred from the feature space back into the input space, so variance alone is a poor guide for choosing among them. Therefore, we propose a KPCA pattern selection algorithm based on correlations with a given geophysical phenomenon. We demonstrate the algorithm with two data sets widely used in geophysical communities, namely the Normalized Difference Vegetation Index (NDVI) and the Southern Oscillation Index (SOI). The results indicate that the new KPCA algorithm can reveal more significant details in spatial patterns than standard PCA.
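The idea of selecting KPCA components by their correlation with a geophysical index can be illustrated with a minimal sketch. It is not the paper's implementation: the Gaussian kernel, the parameter `gamma`, the cutoff `top_k`, and all function names are assumptions made for illustration. Rows of `X` are taken to be time samples (for example, monthly NDVI fields flattened over grid points) and `index` is a reference time series of the same length (for example, SOI).

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian kernel matrix between all pairs of rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kpca_scores(X, gamma, n_components):
    """Project the samples onto the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = J @ K @ J                           # kernel centered in feature space
    w, V = np.linalg.eigh(Kc)                # eigenvalues in ascending order
    order = np.argsort(w)[::-1][:n_components]
    w, V = w[order], V[:, order]
    # Score of sample i on component k is sqrt(lambda_k) * V[i, k].
    return V * np.sqrt(np.maximum(w, 0.0))

def select_by_correlation(scores, index, top_k):
    """Rank components by absolute Pearson correlation with the reference index."""
    s = scores - scores.mean(axis=0)
    z = index - index.mean()
    denom = np.linalg.norm(s, axis=0) * np.linalg.norm(z)
    r = (s.T @ z) / np.where(denom > 0, denom, np.inf)
    ranked = np.argsort(-np.abs(r))[:top_k]
    return ranked, r[ranked]
```

Ranking by correlation rather than by eigenvalue reflects the point made in the abstract: a component with a small feature-space eigenvalue may still track the phenomenon of interest closely, so it would be missed by a variance-based cutoff.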
