A comparison of PCA, KPCA and ICA for dimensionality reduction in support vector machine

Abstract Recently, the support vector machine (SVM) has become a popular tool in time series forecasting. In developing a successful SVM forecaster, the first step is feature extraction. This paper proposes applying principal component analysis (PCA), kernel principal component analysis (KPCA) and independent component analysis (ICA) to SVM for feature extraction. PCA linearly transforms the original inputs into new uncorrelated features. KPCA is a nonlinear PCA developed by using the kernel method. In ICA, the original inputs are linearly transformed into features that are mutually statistically independent. Experiments on the sunspot data, Santa Fe data set A and five real futures contracts show that an SVM with feature extraction by PCA, KPCA or ICA can perform better than one without feature extraction. Furthermore, among the three methods, KPCA feature extraction performs best, followed by ICA feature extraction.
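To make the difference between the linear and kernel transforms concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes a synthetic two-frequency sine series in place of the sunspot or futures data, a lag window of 8, an RBF kernel with `gamma=0.5`, and three retained components; all of these values and the helper names `lag_embed`, `pca_features` and `kpca_features` are illustrative choices. ICA and the SVM regressor itself are omitted; in practice a library such as scikit-learn (`PCA`, `KernelPCA`, `FastICA`, `SVR`) would likely be used for the full pipeline.

```python
import numpy as np

def lag_embed(series, n_lags):
    """Stack n_lags consecutive values as one input row; the next value is the target."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

def pca_features(X, n_components):
    """Linear PCA: project centered inputs onto the leading principal directions."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kpca_features(X, n_components, gamma=1.0):
    """Kernel PCA with an RBF kernel (gamma is an illustrative assumption)."""
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    # Center the kernel matrix, i.e. center the data in feature space.
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)  # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # For training points, the projection on component k is sqrt(lambda_k) * v_k.
    return vecs * np.sqrt(np.maximum(vals, 0.0))

# Synthetic stand-in series (an assumption, not one of the paper's data sets).
t = np.arange(300)
series = np.sin(0.2 * t) + 0.5 * np.sin(0.05 * t)

X, y = lag_embed(series, n_lags=8)
Z_pca = pca_features(X, 3)               # uncorrelated linear features
Z_kpca = kpca_features(X, 3, gamma=0.5)  # nonlinear features via the kernel method
```

Either `Z_pca` or `Z_kpca` would then replace `X` as the input to the SVM regressor, with `y` as the forecasting target.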
