An assessment of the effectiveness of decision tree methods for land cover classification

Abstract: The choice of a classification algorithm is generally based on a number of factors, among which are availability of software, ease of use, and performance, measured here by overall classification accuracy. The maximum likelihood (ML) procedure is, for many users, the algorithm of choice because of its ready availability and the fact that it does not require an extended training process. Artificial neural networks (ANNs) are now widely used by researchers, but their operational application is hindered by the need for the user to specify the network architecture and to provide values for a number of parameters, both of which affect performance. The ANN also requires an extended training phase. In the past few years, the use of decision trees (DTs) to classify remotely sensed data has increased. Proponents of the method claim that it has a number of advantages over the ML and ANN algorithms. The DT is computationally fast, makes no statistical assumptions, and can handle data that are represented on different measurement scales. Software to implement DTs is readily available over the Internet. Pruning of DTs can make them smaller and more easily interpretable, while the use of boosting techniques can improve performance. In this study, separate test and training data sets from two different geographical areas and two different sensors (multispectral Landsat ETM+ and hyperspectral DAIS) are used to evaluate the performance of univariate and multivariate DTs for land cover classification. The factors considered are the effects of variations in training data set size and in the dimensionality of the feature space, together with the impact of boosting, attribute selection measures, and pruning. The level of classification accuracy achieved by the DT is compared with results from back-propagation ANN and ML classifiers.
Our results indicate that the performance of the univariate DT is acceptably good in comparison with that of other classifiers, except with high-dimensional data. Classification accuracy increases linearly with training data set size to a limit of 300 pixels per class in this case. Multivariate DTs do not appear to perform better than univariate DTs. While boosting produces an increase in classification accuracy of between 3% and 6%, the use of attribute selection methods does not appear to be justified in terms of accuracy increases. However, neither the univariate DT nor the multivariate DT performed as well as the ANN or ML classifiers with high-dimensional data.
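To make the terms "univariate DT" and "attribute selection measure" concrete, the following is a minimal sketch, not the implementation used in the study. It scores candidate threshold splits on a single feature (a univariate split) with three commonly used measures: information gain, gain ratio, and Gini decrease. The pixel values, class labels, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_scores(values, labels, threshold):
    """Score the univariate test `value <= threshold` with three measures."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        # A split that leaves one side empty separates nothing.
        return {"info_gain": 0.0, "gain_ratio": 0.0, "gini_decrease": 0.0}
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    # Split information penalises very unbalanced partitions (gain ratio).
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    gini_decrease = gini(labels) - (w_left * gini(left) + w_right * gini(right))
    return {
        "info_gain": info_gain,
        "gain_ratio": info_gain / split_info,
        "gini_decrease": gini_decrease,
    }


def best_univariate_split(values, labels, measure="gain_ratio"):
    """Return the midpoint threshold that maximises the chosen measure."""
    distinct = sorted(set(values))
    thresholds = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(thresholds, key=lambda t: split_scores(values, labels, t)[measure])


# Hypothetical digital numbers for one spectral band and their classes.
band = [12, 15, 20, 55, 60, 70]
classes = ["water", "water", "water", "forest", "forest", "forest"]

threshold = best_univariate_split(band, classes)
print(threshold, split_scores(band, classes, threshold))
```

A multivariate DT would instead test a linear combination of several features at each node. The sketch covers only the single-feature case and omits pruning and boosting.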
