Effect of Alternative Splitting Rules on Image Processing Using Classification Tree Analysis

Rule-based classification using classification tree analysis (CTA) is increasingly applied to remotely sensed data. CTA employs splitting rules to construct decision trees from training data, and the resulting trees are then used for image classification. Software implementations of CTA offer different splitting rules but provide practitioners little guidance for selecting among them. We evaluated classification accuracy for four commonly used splitting rules across three types of imagery. Overall accuracies within data types varied by less than 6 percent, and pairwise comparisons of kappa statistics indicated no significant differences (p-value > 0.05). Individual class accuracies, measured by user’s and producer’s accuracy, however, varied among methods. The entropy and twoing splitting rules most often accounted for the poorest performing classes. Based on analysis of the structure of the rules and the results from our three data sets, we recommend the gini and class probability rules for classification of remotely sensed data when the software provides the option.
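To make the idea of a splitting rule concrete, the following minimal sketch scores one candidate binary split under three of the rules named above (gini, entropy, and twoing), using their standard textbook definitions. It is an illustration only, not the software or data used in the study; the function names and the toy "water"/"forest" labels are assumptions introduced here, and the class probability rule is not covered.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return {class: proportion} for a list of class labels."""
    n = len(labels)
    return {c: k / n for c, k in Counter(labels).items()}


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels).values())


def entropy(labels):
    """Entropy impurity in bits: -sum(p * log2(p)) over classes present."""
    return -sum(p * math.log2(p) for p in class_proportions(labels).values())


def impurity_decrease(parent, left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


def twoing(parent, left, right):
    """Twoing score: (pL * pR / 4) * (sum over classes of |p(j|L) - p(j|R)|)^2."""
    n = len(parent)
    p_left, p_right = len(left) / n, len(right) / n
    props_l = class_proportions(left)
    props_r = class_proportions(right)
    diff = sum(abs(props_l.get(c, 0.0) - props_r.get(c, 0.0)) for c in set(parent))
    return p_left * p_right / 4.0 * diff ** 2


# Hypothetical training labels at one node and one candidate split of them.
parent = ["water"] * 4 + ["forest"] * 4
left = ["water"] * 4
right = ["forest"] * 4

gini_gain = impurity_decrease(parent, left, right, gini)
entropy_gain = impurity_decrease(parent, left, right, entropy)
twoing_score = twoing(parent, left, right)
```

A tree-growing routine would evaluate a score like these for every candidate split at a node and keep the split with the highest value. The rules can rank candidate splits differently, which is one way the choice of rule could affect individual class accuracies.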
