Data Mining Attribute Selection Approach for Drought Modeling: A Case Study for Greater Horn of Africa

The objectives of this paper were to 1) develop an empirical method for selecting relevant attributes for drought modelling, and 2) select the most relevant attributes for drought modelling and prediction in the Greater Horn of Africa (GHA). Twenty-four attributes from different domain areas were used in this experimental analysis. Two attribute selection algorithms were applied: Principal Component Analysis (PCA) and correlation-based attribute selection (CAS). Using the PCA and CAS algorithms, the 24 attributes were ranked by their merit value, and 15 attributes were selected for modelling drought in the GHA. The average merit values of the selected attributes ranged from 0.5 to 0.9. Future research may evaluate the developed methodology using relevant classification techniques and quantify the actual information gain from the developed approach.
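The ranking-and-selection idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that the merit of an attribute is the absolute Pearson correlation between that attribute and the drought target, which is a simplified stand-in for the CAS step, and it leaves out PCA entirely. The attribute names, the toy values, and the 0.5 cut-off are hypothetical and chosen only for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def rank_attributes(attributes, target):
    """Return (name, merit) pairs sorted by merit, highest first.

    Assumption: merit = |Pearson correlation with the target|.
    """
    scores = {name: abs(pearson(values, target))
              for name, values in attributes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def select_attributes(ranked, threshold):
    """Keep the attributes whose merit reaches the threshold."""
    return [name for name, merit in ranked if merit >= threshold]


# Hypothetical toy data: three candidate attributes and a drought target.
attributes = {
    "rainfall_anomaly": [1.0, 2.0, 3.0, 4.0, 5.0],
    "soil_moisture": [2.0, 1.0, 4.0, 3.0, 5.0],
    "elevation": [3.0, 1.0, 2.0, 1.0, 3.0],
}
drought_index = [1.1, 1.9, 3.2, 3.9, 5.1]

ranked = rank_attributes(attributes, drought_index)
selected = select_attributes(ranked, threshold=0.5)  # illustrative cut-off

for name, merit in ranked:
    print(f"{name}: {merit:.3f}")
print("selected:", selected)
```

On this toy data the two attributes that track the target are retained and the weakly correlated one is dropped. A fuller treatment would also account for redundancy among the attributes themselves, which a per-attribute ranking like this one ignores.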
