A critical feature extraction by kernel PCA in stock trading model

This paper presents a kernel-based principal component analysis (kernel PCA) method for extracting critical features to improve the performance of a stock trading model. Feature extraction is one family of techniques for solving dimensionality reduction problems (DRP). Kernel PCA is a feature extraction approach that transforms the known variables into a new representation that captures the critical information; as a kernel-based data mapping tool, it combines the characteristics of principal component analysis with those of non-linear mapping. Feature selection is another DRP technique: it keeps only a small subset of the known variables, but the selected features may still be collinear and therefore fail to convey clear information. Most feature extraction methods, by contrast, apply a variable mapping that eliminates noisy and collinear variables. In this research, we use kernel PCA in a stock trading model to transform stock technical indices (TI) into a smaller set of features. The method is applied to various stocks and evaluated with sliding-window testing under both half-year and 1-year testing strategies. The experimental results show that the proposed method generates more profit than other DRP methods on the American stock market. The stock trading model is practical for real-world application and can be implemented in a real-time environment.
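The kernel PCA transformation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the RBF kernel, the `gamma` value, the number of components, and the synthetic `indicators` matrix (standing in for technical indices) are all assumptions chosen only to show how a set of input variables is mapped to a smaller set of non-linear features.

```python
import numpy as np

def rbf_kernel_pca(X, n_components, gamma):
    """Project the rows of X onto the leading kernel principal components.

    Assumes an RBF kernel k(x, y) = exp(-gamma * ||x - y||^2); the abstract
    does not state which kernel the authors used.
    """
    # Pairwise squared Euclidean distances between samples.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))

    # Center the kernel matrix, which centers the data in feature space.
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Projection of each training sample onto component k is sqrt(lambda_k) * v_k.
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

# Synthetic stand-in for 60 trading days x 8 technical indices (hypothetical data).
rng = np.random.default_rng(0)
indicators = rng.normal(size=(60, 8))

features = rbf_kernel_pca(indicators, n_components=3, gamma=0.1)
print(features.shape)
```

The extracted columns are mutually uncorrelated by construction, which is the property the abstract contrasts with feature selection, where the retained variables may remain collinear.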
