Comparative Analysis of Data Mining Techniques for Malaysian Rainfall Prediction

Climate change prediction analyses weather behaviour over a specific period. Rainfall forecasting is a climate change task in which features such as humidity and wind are used to predict rainfall at specific locations. Rainfall prediction can be framed as a classification task in data mining. Different techniques perform differently depending on how the rainfall data are represented, including representations of long-term (monthly) and short-term (daily) patterns. Selecting an appropriate technique for a specific duration of rainfall is therefore a challenging task. This study analyses multiple classifiers, namely Naive Bayes, Support Vector Machine, Decision Tree, Neural Network and Random Forest, for rainfall prediction using Malaysian data. The dataset was collected from multiple stations in Selangor, Malaysia. Several pre-processing tasks were applied to resolve missing values and eliminate noise. The experimental results show that with a small training set (10%) from 1581 instances, Random Forest correctly classified 1043 instances. This reflects the strength of the ensemble of trees in Random Forest, where a group of classifiers can jointly outperform a single classifier.
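The closing claim, that a group of classifiers can jointly beat a single classifier, can be illustrated with a small calculation. The sketch below is not the study's method: it assumes an idealized ensemble of classifiers that err independently, each with the same accuracy, and combines them by majority vote. Trees in a real Random Forest are correlated, so the actual gain is smaller. The function name `majority_vote_accuracy` and the accuracy value 0.6 are hypothetical, chosen only for illustration.

```python
from math import comb


def majority_vote_accuracy(n_classifiers: int, p_correct: float) -> float:
    """Probability that a strict majority of voters is correct.

    Assumption (not from the study): every classifier is correct with the
    same probability p_correct, and their errors are independent.
    """
    if n_classifiers < 1 or n_classifiers % 2 == 0:
        raise ValueError("use a positive odd number of classifiers to avoid ties")
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie in [0, 1]")

    # Smallest number of correct votes that forms a strict majority.
    majority = n_classifiers // 2 + 1

    # Sum the binomial probabilities of k correct votes for k >= majority.
    return sum(
        comb(n_classifiers, k)
        * p_correct ** k
        * (1.0 - p_correct) ** (n_classifiers - k)
        for k in range(majority, n_classifiers + 1)
    )


if __name__ == "__main__":
    # Illustrative values only: a single voter versus growing ensembles.
    for n in (1, 5, 25, 101):
        print(n, round(majority_vote_accuracy(n, 0.6), 3))
```

Under these assumptions the ensemble accuracy rises with the number of voters whenever the individual accuracy exceeds 0.5, and falls when it is below 0.5, which is why the individual classifiers need to be better than chance for voting to help.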
