Scalable data-driven modeling of spatio-temporal systems: Weather forecasting

This paper proposes a new data-driven method for short-range forecasting of spatio-temporal systems. The method uses NCEP data as raw input to construct a forecasting model. The global model consists of several local models, each constructed in three steps. In the first step, a local dataset is built from the NCEP raw data; this dataset is very high-dimensional and contains a large number of redundant and irrelevant features. In the second step, a feature selection method named GRASP is applied to the local dataset, producing a new local dataset with significantly fewer features. In the third step, a regression ensemble method called Bagging is used to construct the local model. Both GRASP and Bagging are scalable modules with respect to the computational power required. The proposed method makes it possible to control the trade-off between speed and precision. In addition to being scalable, the proposed method produces, at some points, forecasts that are more precise than those of the GFS system.
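
The three steps described for one local model can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins for an NCEP-derived local dataset, the base learner is an assumed k-nearest-neighbour regressor, the feature score is an assumed absolute Pearson correlation, and the GRASP part is a simplified greedy randomized construction with a drop-one local search. All function names and parameters (`grasp_select`, `bagging_fit`, `iters`, `rcl_size`, `n_models`) are hypothetical.

```python
import random

def pearson_abs(xs, ys):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / ((vx * vy) ** 0.5) if vx > 0 and vy > 0 else 0.0

def knn_predict(train_X, train_y, x, feats, k=3):
    """Assumed base learner: average the targets of the k nearest training rows."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in feats), y)
        for row, y in zip(train_X, train_y)
    )
    return sum(y for _, y in dists[:k]) / k

def holdout_mse(X, y, feats, split):
    """Validation MSE of k-NN restricted to the given feature subset."""
    tr_X, tr_y, va_X, va_y = X[:split], y[:split], X[split:], y[split:]
    errs = [(knn_predict(tr_X, tr_y, x, feats) - t) ** 2 for x, t in zip(va_X, va_y)]
    return sum(errs) / len(errs)

def grasp_select(X, y, size, iters, rcl_size, rng, split):
    """Simplified GRASP: randomized greedy construction plus drop-one local search."""
    d = len(X[0])
    scores = [pearson_abs([row[j] for row in X[:split]], y[:split]) for j in range(d)]
    best, best_err = None, float("inf")
    for _ in range(iters):
        subset, cands = [], sorted(range(d), key=lambda j: -scores[j])
        while len(subset) < size:
            pick = rng.choice(cands[:rcl_size])  # restricted candidate list
            subset.append(pick)
            cands.remove(pick)
        err = holdout_mse(X, y, subset, split)
        improved = True
        while improved and len(subset) > 1:  # local search: drop a feature if it helps
            improved = False
            for f in list(subset):
                trial = [g for g in subset if g != f]
                trial_err = holdout_mse(X, y, trial, split)
                if trial_err < err:
                    subset, err, improved = trial, trial_err, True
                    break
        if err < best_err:
            best, best_err = sorted(subset), err
    return best

def bagging_fit(X, y, n_models, rng):
    """Each ensemble member is a bootstrap resample of the training rows."""
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_predict(models, x, feats):
    """Average the base-learner predictions over all bootstrap members."""
    return sum(knn_predict(bx, by, x, feats) for bx, by in models) / len(models)

# Step 1: a synthetic local dataset (10 features, only two of them informative).
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(10)] for _ in range(150)]
y = [3 * row[2] - 2 * row[7] + rng.gauss(0, 0.05) for row in X]
train_X, train_y, test_X, test_y = X[:120], y[:120], X[120:], y[120:]

# Step 2: feature selection; Step 3: bagged regression on the reduced features.
feats = grasp_select(train_X, train_y, size=4, iters=8, rcl_size=6, rng=rng, split=90)
models = bagging_fit(train_X, train_y, n_models=10, rng=rng)
preds = [bagging_predict(models, x, feats) for x in test_X]
test_mse = sum((p - t) ** 2 for p, t in zip(preds, test_y)) / len(preds)
print("selected features:", feats, "test MSE:", test_mse)
```

In this sketch, `iters` and `n_models` are the knobs that trade computation for accuracy: larger values cost more time but search more feature subsets and average over more ensemble members, which is one way to read the speed/precision trade-off mentioned above.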
