Forecasting foreign exchange rates using kernel methods

First, the all-important no free lunch theorems are introduced. Next, kernel methods, support vector machines (SVMs), preprocessing, model selection, feature selection, SVM software and the Fisher kernel are introduced and discussed. A hidden Markov model is trained on foreign exchange data to derive a Fisher kernel for an SVM; the DC algorithm and the Bayes point machine (BPM) are also used to learn the kernel on foreign exchange data. Further, the DC algorithm is used to learn the parameters of the hidden Markov model in the Fisher kernel, creating a hybrid algorithm. Mean net returns were positive for the BPM. The BPM, the Fisher kernel, the DC algorithm and the hybrid algorithm all improved on a standard SVM in terms of both gross and net returns, but none achieved net returns as high as the genetic programming approach employed by Neely, Weller, and Dittmar (1997) and published in Neely, Weller, and Ulrich (2009). Two implementations of SVMs for Windows with semi-automated parameter selection are built.

[1]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[2]  Marti A. Hearst Intelligent Connections: Battling with GA-Joe. , 1998 .

[3]  Cullen Schaffer,et al.  A Conservation Law for Generalization Performance , 1994, ICML.

[4]  L. Rabiner,et al.  An introduction to hidden Markov models , 1986, IEEE ASSP Magazine.

[5]  David Hume,et al.  A Treatise of Human Nature. Being an Attempt to Introduce the Experimental Method of Reasoning Into Moral Subjects. Edited with an Introd. By D.G.C. Macnabb. , 1972 .

[6]  R. A. Silverman,et al.  Introductory Real Analysis , 1972 .

[7]  E. Fama,et al.  Filter Rules and Stock-Market Trading , 1966 .

[8]  Christopher J. Neely,et al.  Technical Analysis and Central Bank Intervention , 2000 .

[9]  Iain L. MacDonald,et al.  Hidden Markov and Other Models for Discrete-valued Time Series , 1997 .

[10]  Stephen L Taylor,et al.  Trading futures using a channel rule: A study of the predictive power of technical analysis with currency examples , 1994 .

[11]  Charles A. Micchelli,et al.  A DC-programming algorithm for kernel selection , 2006, ICML.

[12]  David R. Anderson,et al.  Model selection and multimodel inference : a practical information-theoretic approach , 2003 .

[13]  Tin Kam Ho,et al.  The Random Subspace Method for Constructing Decision Forests , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[14]  David H. Wolpert,et al.  The Lack of A Priori Distinctions Between Learning Algorithms , 1996, Neural Computation.

[15]  Marti A. Hearst Trends & Controversies: Support Vector Machines , 1998, IEEE Intell. Syst..

[16]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[17]  L. Baum,et al.  Statistical Inference for Probabilistic Functions of Finite State Markov Chains , 1966 .

[18]  Colin Campbell,et al.  Bayes Point Machines , 2001, J. Mach. Learn. Res..

[19]  W. Tobler A Computer Movie Simulating Urban Growth in the Detroit Region , 1970 .

[20]  Christopher J. Neely Technical analysis in the foreign exchange market: a layman's guide , 1997 .

[21]  Christopher J. Neely,et al.  The Adaptive Markets Hypothesis: Evidence from the Foreign Exchange Market , 2009, Journal of Financial and Quantitative Analysis.

[22]  M. Peruggia Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (2nd ed.) , 2003 .

[23]  David H. Wolpert,et al.  No free lunch theorems for optimization , 1997, IEEE Trans. Evol. Comput..

[24]  R. Sweeney,et al.  Beating the Foreign Exchange Market , 1986 .

[25]  B.-H. Juang,et al.  Maximum-likelihood estimation for mixture multivariate stochastic observations of Markov chains , 1985, AT&T Technical Journal.

[26]  Sean R. Eddy,et al.  Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids , 1998 .

[27]  Christopher J. Neely,et al.  Is Technical Analysis in the Foreign Exchange Market Profitable? A Genetic Programming Approach , 1996, Journal of Financial and Quantitative Analysis.

[28]  Nello Cristianini,et al.  Kernel Methods for Pattern Analysis , 2003, ICTAI.

[29]  L. Baum,et al.  A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .

[30]  Christopher J. C. Burges,et al.  A Tutorial on Support Vector Machines for Pattern Recognition , 1998, Data Mining and Knowledge Discovery.

[31]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[32]  Christopher J. Neely,et al.  Can Markov Switching Models Predict Excess Foreign Exchange Returns? , 2006 .

[33]  Cheolbeom Park,et al.  The Profitability of Technical Analysis: A Review , 2004 .

[34]  Eric Moulines,et al.  Inference in Hidden Markov Models (Springer Series in Statistics) , 2005 .

[35]  Nils Lid Hjort,et al.  Model Selection and Model Averaging , 2001 .

[36]  John B. Moore,et al.  Hidden Markov Models: Estimation and Control , 1994 .

[37]  R. Horst,et al.  DC Programming: Overview , 1999 .

[38]  Timothy Masters,et al.  Neural, Novel & Hybrid Algorithms for Time Series Prediction , 1995 .

[39]  M. Aizerman,et al.  Theoretical Foundations of the Potential Function Method in Pattern Recognition Learning , 1964 .

[40]  L. Baum,et al.  An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology , 1967 .

[41]  J. Kemeny,et al.  Denumerable Markov chains , 1969 .

[42]  Tom Minka,et al.  A family of algorithms for approximate Bayesian inference , 2001 .

[43]  Christopher J. Neely Technical analysis and the profitability of U.S. foreign exchange intervention , 1998 .

[44]  Tom M. Mitchell,et al.  The Need for Biases in Learning Generalizations , 2007 .

[45]  Tom Minka,et al.  Expectation Propagation for approximate Bayesian inference , 2001, UAI.

[46]  Sean R Eddy,et al.  What is a hidden Markov model? , 2004, Nature Biotechnology.

[47]  L. Baum,et al.  An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process , 1972 .

[48]  A. Poritz,et al.  Hidden Markov models: a guided tour , 1988, ICASSP-88, International Conference on Acoustics, Speech, and Signal Processing.

[49]  S. Bernstein Sur l'extension du théorème limite du calcul des probabilités aux sommes de quantités dépendantes , 1927 .

[50]  A. Lo The Statistics of Sharpe Ratios , 2002 .

[51]  R. Bhar,et al.  Hidden Markov Models: Applications to Financial Economics , 2004 .