Predicting probability distributions for surf height using an ensemble of mixture density networks

In many applications of machine learning, it is more useful to predict the probability distribution of a variable than simply its most likely value. In meteorology and finance, for example, it is often important to know the probability that a variable falls within (or outside) a given range. In this paper we consider the prediction of surf height, with the objective of predicting whether it will fall within a given 'surfable' range. Prediction problems of this kind are considerably more difficult when the distribution of the phenomenon differs significantly from a normal distribution, as is the case with the surf data we have studied. To address this, we use an ensemble of mixture density networks to predict the probability density function. Our evaluation shows that this is an effective solution. We also describe a web-based application that presents these predictions in a usable manner.
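
As a rough illustration of how a predicted density turns into the probability of a 'surfable' range, the sketch below integrates a mixture density over an interval and averages the result across ensemble members. It is a minimal sketch under stated assumptions, not the method used in the paper: it assumes Gaussian mixture components and a simple unweighted average over members, and all function names, parameter values, and the 1.0 to 2.5 range are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def mixture_prob_in_range(weights, means, stds, low, high):
    """P(low <= height <= high) under one Gaussian mixture.

    weights, means, stds are the mixture parameters a single mixture
    density network would output for one input (assumed Gaussian kernels).
    """
    return sum(
        w * (normal_cdf(high, m, s) - normal_cdf(low, m, s))
        for w, m, s in zip(weights, means, stds)
    )


def ensemble_prob_in_range(members, low, high):
    """Average the in-range probability over ensemble members.

    members is a list of (weights, means, stds) tuples, one per network.
    Unweighted averaging is an assumption made for this illustration.
    """
    probs = [mixture_prob_in_range(w, m, s, low, high) for w, m, s in members]
    return sum(probs) / len(probs)


# Hypothetical outputs of two networks for the same forecast input.
members = [
    ([0.7, 0.3], [1.2, 2.8], [0.4, 0.6]),
    ([0.6, 0.4], [1.4, 2.5], [0.5, 0.7]),
]
surfable = ensemble_prob_in_range(members, low=1.0, high=2.5)
print(f"P(surfable) = {surfable:.3f}")
```

Because each mixture is integrated in closed form through the normal CDF, the interval probability needs no numerical integration. A skewed or multimodal predicted density is handled simply by using more than one component.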
