A Novel Borda Count based Feature Ranking and Feature Fusion Strategy to Attain Effective Climatic Features for Rice Yield Prediction

This work attempts to predict the effect of climatic variability on rice crop production, using rice production data and climatic features from three coastal regions of Odisha, a state of India. The novelty of the work is a Borda Count based fusion strategy applied to the ranked features obtained from several feature ranking methods. The proposed prediction model works in three phases. In the first phase, three feature ranking approaches, namely Random Forest, Support Vector Regression-Recursive Feature Elimination (SVR-RFE), and F-Test, are applied individually to the two datasets of the three coastal areas, and each approach ranks the features according to its own algorithm. In the second phase, Borda Count is applied as a fusion method to the ranked features from the first phase to obtain the top five features. In the third phase, an Extreme Learning Machine (ELM) with a multiquadric activation function predicts the rice crop yield from the features selected by the fusion-based ranking strategy, and the number of features that gives a prediction accuracy above 99% is determined. Finally, a statistical paired t-test is used to evaluate and validate the significance of the proposed fusion-based ranking prediction model. The model not only predicts the rice yield per hectare but also identifies the most influential features during the Rabi and Kharif seasons. The experiments show that relative humidity plays a vital role, along with minimum and maximum temperature, in rice crop yield during the Rabi and Kharif seasons.
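
The fusion step in the second phase can be illustrated with a minimal sketch. It assumes the common positional form of the Borda Count, in which a feature at position p (counting from 0) in a ranking of n features receives n - 1 - p points, and the points are summed over all rankings. The abstract does not state the exact scoring rule or tie-breaking used, so both are assumptions here. The feature names and the three rankings are hypothetical placeholders, not data or results from the study.

```python
from collections import defaultdict


def borda_fuse(rankings, top_k=5):
    """Fuse several best-first feature rankings with a positional Borda count.

    Each ranking is a list of feature names, best first. A feature at
    position p in a ranking of length n earns n - 1 - p points.
    Returns the top_k features by total points and the full score table.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] += n - 1 - position
    # Highest total first; ties broken alphabetically (an assumption made
    # only to keep the output deterministic).
    fused = sorted(scores, key=lambda f: (-scores[f], f))
    return fused[:top_k], dict(scores)


# Hypothetical rankings standing in for the three rankers' outputs.
random_forest_rank = ["rel_humidity", "min_temp", "max_temp",
                      "rainfall", "wind_speed", "sunshine"]
svr_rfe_rank = ["min_temp", "rel_humidity", "rainfall",
                "max_temp", "sunshine", "wind_speed"]
f_test_rank = ["rel_humidity", "max_temp", "min_temp",
               "sunshine", "rainfall", "wind_speed"]

top_features, scores = borda_fuse(
    [random_forest_rank, svr_rfe_rank, f_test_rank], top_k=5
)
print(top_features)
print(scores)
```

A positional count of this kind rewards features that rank consistently well across all three rankers, so a feature placed first by one method but last by the others does not dominate the fused list.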
