Quadratic-radial-basis-function-kernel for classifying multi-class agricultural datasets with continuous attributes

Abstract: Classifying agricultural data such as soil and crop data is important because it allows stakeholders to make informed farming decisions. Soil classification helps farmers decide which crop to sow in a particular type of soil. Similarly, wheat variety classification helps in selecting the right type of wheat for a particular product. Current methods for classifying agricultural data are mostly manual. They involve field visits and surveys and are labor-intensive, expensive, and prone to human error. Recently, data mining techniques such as decision trees, k-nearest neighbors (k-NN), support vector machines (SVM), and Naive Bayes (NB) have been used to classify agricultural data such as soil, crops, and land cover. The resulting classification aids the decision-making process of government organizations and agro-industries. SVM is a popular approach for data classification. A recent study on SVM highlighted that using multiple kernels instead of a single kernel leads to better performance because of the greater learning and generalization power. In this work, a hybrid-kernel-based support vector machine (H-SVM) is proposed for classifying multi-class agricultural datasets with continuous attributes. A genetic algorithm (GA) or gradient descent (GD) is used to select the SVM parameters C and γ. The proposed kernel, called the quadratic-radial-basis-function kernel (QRK), combines the quadratic and radial basis function (RBF) kernels. The proposed classifier can classify multi-class agricultural datasets with continuous features. Rigorous experiments with the proposed method are performed on standard benchmark and real-world agricultural datasets. The results reveal a significant performance improvement over state-of-the-art methods such as NB, k-NN, and SVM in terms of accuracy, sensitivity, specificity, precision, and F-score.
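
The abstract does not say how the quadratic and RBF kernels are combined in QRK. As a minimal sketch only, the code below assumes a weighted sum of a degree-2 polynomial kernel and an RBF kernel; the function names and the parameters `weight` and `coef0` are hypothetical and are not taken from the paper. A non-negative weighted sum of two valid kernels is itself a valid kernel, which is why this form is a common choice for hybrid kernels.

```python
import math


def quadratic_kernel(x, z, coef0=1.0):
    """Degree-2 polynomial kernel: (x . z + coef0)^2."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + coef0) ** 2


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def qrk_kernel(x, z, gamma=0.5, coef0=1.0, weight=0.5):
    """Hybrid kernel under an ASSUMED form (not confirmed by the paper):
    weight * quadratic + (1 - weight) * RBF, with 0 <= weight <= 1."""
    return (weight * quadratic_kernel(x, z, coef0)
            + (1.0 - weight) * rbf_kernel(x, z, gamma))


def gram_matrix(samples, kernel):
    """Pairwise kernel values for a list of feature vectors."""
    return [[kernel(a, b) for b in samples] for a in samples]


# Three toy samples with two continuous attributes each (made-up values).
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = gram_matrix(samples, qrk_kernel)
```

A Gram matrix built this way could be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. In that setting C remains the SVM regularization parameter and γ the RBF width, the two values the abstract says are selected by GA or GD.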
