Classification of agricultural soil parameters in India

Classification of soil data from Marathwada (India) avoiding chemical analysis.
Random forest exceeds 90% of the maximum kappa in all the classification problems.
Cohen kappa of about 65-90% for classification of village-wise soil fertility indices.
Classification of N2O, P2O5 and K2O for fertilizer recommendation, soil pH, soil type and suitable crop.
Some classification models remain valid across different regions in India.

Agriculture is one of the backbones of the Indian economy, and it is constrained by poor soil fertility. In this study we use chemical soil measurements to classify many relevant soil parameters: village-wise fertility indices of organic carbon (OC), phosphorus pentoxide (P2O5), manganese (Mn) and iron (Fe); soil pH and type; soil nutrients nitrous oxide (N2O), P2O5 and potassium oxide (K2O), in order to recommend suitable amounts of fertilizers; and preferable crop. Classifying these soil parameters saves the time of specialized technicians who would otherwise perform expensive chemical analyses. These ten classification problems are solved using a collection of twenty very diverse classifiers, selected for their high performance, from the following families: bagging, boosting, decision trees, nearest neighbors, neural networks, random forests (RF), rule-based classifiers and support vector machines (SVM). RF achieves the best performance in six of the ten problems and exceeds 90% of the maximum performance in all cases, followed by adaboost, SVM and Gaussian extreme learning machine. Although the performance is moderate for some problems (pH, N2O, P2O5 and K2O), some classifiers (e.g. for fertility indices of P2O5, Mn and Fe) trained in one region proved valid for other Indian regions.
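The performance figures above are stated as Cohen kappa and as a percentage of the best kappa obtained on each problem. As a minimal sketch of how such figures can be computed, the following Python code derives kappa from true and predicted labels and expresses each classifier's kappa relative to the best one. The function names, class labels and kappa values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    # Fraction of samples on which prediction and truth agree.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal class frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Only one class on both sides: kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


def percent_of_best(kappas):
    """Express each classifier's kappa as a percentage of the best kappa."""
    best = max(kappas.values())
    if best <= 0:
        raise ValueError("best kappa must be positive")
    return {name: 100.0 * k / best for name, k in kappas.items()}


# Hypothetical fertility-index labels for four samples.
truth = ["low", "low", "high", "high"]
predicted = ["low", "low", "high", "low"]
kappa = cohen_kappa(truth, predicted)

# Hypothetical kappa values for two classifiers on one problem.
relative = percent_of_best({"rf": 0.8, "svm": 0.6})
```

Under this reading, a classifier "exceeds 90% of the maximum kappa" on a problem when its entry in the dictionary returned by `percent_of_best` is above 90.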
