Climate data clustering effects on arid and semi-arid rainfed wheat yield: a comparison of artificial intelligence and K-means approaches

Clustering algorithms are data mining techniques used to analyze a wide range of data. This study compares ant colony optimization (ACO), a genetic algorithm (GA), and K-means for clustering the climatic variables that affect the yield of rainfed wheat in northeast Iran from 1984 to 2010 (27 years). These variables included sunshine hours, wind speed, relative humidity, precipitation, maximum temperature, minimum temperature, and the number of wet days. Seven climatic factors with higher correlations with detrended rainfed wheat yield were selected based on the significance of the Pearson correlation coefficient (P value < 0.1). Three variables (sunshine hours, wind speed, and average relative humidity) were then excluded from clustering. In the next step, based on the Pearson correlation (P value < 0.05) between yield and the seven climate attributes, the fitness function, and the silhouette index, only the four attributes with higher correlation within their clusters were selected for reclustering. Because these four climate attributes were strongly associated with yield, we used four-dimensional clustering to describe the common characteristics of low-, medium-, and high-yielding years; this four-dimensional clustering is the main contribution of the study. The silhouette index indicated that three clusters were the best choice for each station. In the final step, reclustering was performed with the best-performing method. The results showed that GA was the best method.
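To make the silhouette-based choice of cluster count concrete, the following is a minimal sketch, not the study's implementation. It runs plain K-means on synthetic four-dimensional data and picks the number of clusters with the highest mean silhouette index. Everything here is an assumption made for illustration: the 27 synthetic "years", the three group means, the farthest-first initialization, and all function names. The ACO and GA variants compared in the study are not shown.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first(points, k):
    """Deterministic initialization (an assumption, not from the study):
    start at the first point, then repeatedly add the point farthest
    from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette index (Rousseeuw's definition) over all points."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance to own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Synthetic stand-in for 27 years of four standardized climate attributes,
# drawn around three hypothetical group means (9 years per group).
rng = random.Random(42)
group_means = [(0, 0, 0, 0), (5, 5, 5, 5), (10, 0, 10, 0)]
years = [tuple(rng.gauss(m, 0.5) for m in mean)
         for mean in group_means for _ in range(9)]

# Score candidate cluster counts and keep the one with the highest silhouette.
scores = {k: silhouette(years, kmeans(years, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
best_labels = kmeans(years, best_k)
print(scores, best_k)
```

On real station data, the attributes would normally be standardized before clustering so that no single variable dominates the Euclidean distance. The synthetic groups here are deliberately well separated and say nothing about how clearly low-, medium-, and high-yielding years separate in practice.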
