Neural Network Learning for Time-Series Predictions Using Constrained Formulations

In this thesis, we propose a recurrent FIR neural network, develop a constrained formulation for neural network learning, study an efficient violation-guided backpropagation algorithm that solves the constrained formulation based on the theory of extended saddle points, and apply neural network learning to the prediction of both noise-free and noisy time series. The recurrent FIR neural-network architecture combines a recurrent structure with a memory-based FIR structure in order to provide more powerful modeling ability. The constrained formulation for neural network learning incorporates the error of each learning pattern as a constraint, a new cross-validation scheme that allows multiple validation sets to be considered in learning, and new constraints that can be expressed in procedural form. The violation-guided backpropagation algorithm first transforms the constrained formulation into an ℓ1-penalty function and then searches for a saddle point of that penalty function. When we apply the constrained formulation together with violation-guided backpropagation to neural network learning on nearly noise-free time-series benchmarks, we achieve much better prediction performance than previous work while using fewer parameters. For noisy time series, such as financial time series, we have systematically studied the trade-offs between denoising and information preservation, and have proposed three preprocessing techniques for time series with high-frequency noise. In particular, we have proposed a novel approach that first decomposes a noisy time series into different frequency channels and then preprocesses each channel adaptively according to its level of noise. When a low-pass filter is employed to denoise a band, we incorporate constraints on predicting the low-pass data in the lag period. The new constraints enable active training in the lag period, which greatly improves the prediction accuracy in that period. Extensive prediction experiments on financial time series have been conducted to exploit the modeling ability of neural networks, and promising results have been obtained.
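The idea of treating each pattern's error as a constraint and searching for a saddle point of an ℓ1-penalty function can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: a two-tap linear predictor stands in for the recurrent FIR network, the objective is the mean squared error, the constraint on pattern i is |e_i| <= tau, and the names `predict`, `violations`, `penalty`, `penalty_grad`, and `train`, the step sizes, and the backtracking acceptance rule are all hypothetical. The loop descends in the weights and ascends in the multipliers of the violated constraints.

```python
import math

def predict(w, x):
    """Linear FIR-style prediction from lagged inputs (stand-in for a network)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def violations(w, patterns, tau):
    """Per-pattern constraint violation: max(0, |error| - tau)."""
    return [max(0.0, abs(predict(w, x) - y) - tau) for x, y in patterns]

def penalty(w, lam, patterns, tau):
    """Assumed l1-penalty function: MSE plus multiplier-weighted violations."""
    mse = sum((predict(w, x) - y) ** 2 for x, y in patterns) / len(patterns)
    return mse + sum(l * v for l, v in zip(lam, violations(w, patterns, tau)))

def penalty_grad(w, lam, patterns, tau):
    """(Sub)gradient of the penalty function with respect to the weights."""
    n = len(patterns)
    grad = [0.0] * len(w)
    for (x, y), l in zip(patterns, lam):
        e = predict(w, x) - y
        coef = 2.0 * e / n
        if abs(e) > tau:  # only violated constraints contribute a penalty term
            coef += l * math.copysign(1.0, e)
        for j, xj in enumerate(x):
            grad[j] += coef * xj
    return grad

def train(patterns, n_weights, tau, lr=0.5, lam_rate=0.05, max_iters=300):
    """Descend in w, ascend in the multipliers; return the least-violating w seen."""
    w = [0.0] * n_weights
    lam = [0.0] * len(patterns)
    best_w, best_v = list(w), max(violations(w, patterns, tau))
    for _ in range(max_iters):
        if best_v == 0.0:  # every pattern constraint is satisfied
            break
        grad = penalty_grad(w, lam, patterns, tau)
        current, step = penalty(w, lam, patterns, tau), lr
        for _attempt in range(20):  # backtrack until the penalty decreases
            trial = [wi - step * gi for wi, gi in zip(w, grad)]
            if penalty(trial, lam, patterns, tau) < current:
                w = trial
                break
            step *= 0.5
        v = violations(w, patterns, tau)
        # Multipliers grow only on violated patterns (violation-guided ascent).
        lam = [l + lam_rate * vi for l, vi in zip(lam, v)]
        if max(v) < best_v:
            best_w, best_v = list(w), max(v)
    return best_w, lam, best_v

# Toy noise-free series; two lagged values predict the next one.
series = [math.sin(0.3 * t) for t in range(30)]
patterns = [((series[t - 1], series[t - 2]), series[t]) for t in range(2, 30)]
initial_v = max(violations([0.0, 0.0], patterns, 0.05))
w, lam, final_v = train(patterns, n_weights=2, tau=0.05)
print("weights:", w, "max violation:", final_v)
```

Because the multipliers increase only for patterns whose error exceeds the tolerance, the search concentrates on the patterns that are hardest to fit instead of averaging them away, which is the intuition behind using per-pattern constraints in place of a single aggregate error.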
