Bayesian neural networks for nonlinear time series forecasting

In this article, we apply Bayesian neural networks (BNNs) to time series analysis and propose a Monte Carlo algorithm for BNN training. We also go a step further in BNN model selection by placing a prior on network connections rather than on hidden units, as other authors have done. This allows us to treat the selection of hidden units and the selection of input variables uniformly. The BNN model is compared with a number of competitors, including the Box-Jenkins model, the bilinear model, the threshold autoregressive model, and the traditional neural network model, on several popular and challenging data sets. Numerical results show that the BNN model achieves a consistent improvement over these competitors in forecasting future values. Many aspects of our implementation, such as the selection of input variables, the specification of prior distributions, and the treatment of outliers, offer insights into how to improve the generalization ability of BNNs.
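
To make the idea of a prior on connections concrete, the following is a minimal sketch and not the article's implementation. It assumes a single hidden layer of tanh units without bias or shortcut connections, a binary indicator per connection, and an independent Bernoulli prior on each indicator; the function names, the network layout, and the prior form are all illustrative assumptions. It shows how an input variable or a hidden unit drops out of the model as soon as all of its connections are switched off, so both kinds of selection reduce to selecting connections.

```python
import math

def forecast(lags, w_in, mask_in, w_out, mask_out):
    """One-step forecast from a one-hidden-layer tanh network.

    mask_in[j][i] and mask_out[j] are 0/1 indicators (hypothetical names)
    saying whether the corresponding connection is present.
    """
    total = 0.0
    for j in range(len(w_out)):
        if not mask_out[j]:
            continue  # hidden unit j is not connected to the output
        act = sum(w_in[j][i] * lags[i]
                  for i in range(len(lags)) if mask_in[j][i])
        total += w_out[j] * math.tanh(act)
    return total

def active_inputs(mask_in, mask_out):
    """Inputs that still reach the output through at least one hidden unit."""
    n_inputs = len(mask_in[0])
    return [i for i in range(n_inputs)
            if any(mask_in[j][i] and mask_out[j] for j in range(len(mask_out)))]

def active_hidden(mask_in, mask_out):
    """Hidden units with an output connection and at least one input connection."""
    return [j for j in range(len(mask_out)) if mask_out[j] and any(mask_in[j])]

def log_prior_connections(mask_in, mask_out, p=0.5):
    """Log prior of a connection pattern under independent Bernoulli(p)
    indicators. This prior form is an assumption made for illustration."""
    flags = [f for row in mask_in for f in row] + list(mask_out)
    k = sum(flags)
    return k * math.log(p) + (len(flags) - k) * math.log(1.0 - p)

# Three lagged values feed two hidden units; the second unit has no inputs left.
lags = [0.5, -1.0, 2.0]
w_in = [[1.0, 5.0, 0.5], [3.0, 3.0, 3.0]]
mask_in = [[1, 0, 1], [0, 0, 0]]
w_out = [2.0, 7.0]
mask_out = [1, 1]

print(active_inputs(mask_in, mask_out))   # lag index 1 is deselected
print(active_hidden(mask_in, mask_out))   # hidden unit 1 is deselected
print(forecast(lags, w_in, mask_in, w_out, mask_out))
```

In a sampler, a move that flips a single indicator would change the set of inputs, the set of hidden units, or neither, which is the sense in which one prior over connections can cover both selection problems.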
