Chapter 25 A Comparison of Data-Mining Techniques in Predictive Soil Mapping

Abstract Data-mining techniques can be used to predict soil maps. These techniques aim to extract hidden predictive knowledge from large databases. In soil science, they can learn the relationship between mapped soil classes and soil-forming factors, and this relationship can then be used to predict soil classes in comparable landscape units. This makes it possible to build reproducible digital soil maps automatically, which helps to speed up field mapping and reduce costs. The main objective of this chapter is to compare the performance of different data-mining techniques and algorithms from statistics and information theory, including artificial neural networks (ANNs), support vector machines (SVMs), linear regression, learning vector quantisation and classification trees. The techniques are discussed in terms of prediction accuracy and suitability for GIS-based use. Altogether, 10 data-mining algorithms were tested to predict soil classes on the basis of 65 terrain attributes. Prediction accuracy is tested inside and outside the learning area to compare the generalisation ability of the algorithms.
