Classification of remotely sensed imagery using stochastic gradient boosting as a refinement of classification tree analysis

Classification tree analysis (CTA) provides an effective suite of algorithms for classifying remotely sensed data, but it has two limitations: (1) it does not search for optimal tree structures, and (2) it is adversely affected by outliers, inaccurate training data, and unbalanced data sets. Stochastic gradient boosting (SGB) is a refinement of standard CTA that attempts to minimize these limitations by (1) using classification errors to iteratively refine the trees, each fitted to a random sample of the training data, and (2) combining the multiple trees developed over these iterations to classify the data. We compared traditional CTA results to SGB results for three remote-sensing-based data sets: an IKONOS image from the Sierra Nevada Mountains of California, a Probe-1 hyperspectral image from the Virginia City mining district of Montana, and a series of Landsat ETM+ images from the Greater Yellowstone Ecosystem (GYE). SGB improved the overall accuracy of the IKONOS classification from 84% to 95% and of the Probe-1 classification from 83% to 93%. The classes that performed worst under CTA exhibited the largest increases in class accuracy under SGB. For the Landsat data, however, SGB produced a slight decrease in overall classification accuracy.
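The two SGB steps described above (refitting on errors using a random sample, then combining the resulting trees) can be illustrated with a minimal sketch. It is not the software or configuration used in the study. It assumes a two-class problem, a logistic loss, and one-split trees ("stumps") in place of full classification trees, and all function names and parameter values (`fit_sgb`, `n_trees`, `learning_rate`, `subsample`) are hypothetical choices made for illustration.

```python
import math
import random


def fit_stump(X, r):
    """Fit a one-split regression tree to residuals r by least squares."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between distinct values
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no split possible: fall back to a constant prediction
        m = sum(r) / len(r)
        return (0, float("inf"), m, m)
    return best[1:]


def stump_predict(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_sgb(X, y, n_trees=50, learning_rate=0.1, subsample=0.5, seed=0):
    """Illustrative stochastic gradient boosting for labels y in {0, 1}."""
    rng = random.Random(seed)
    p = min(max(sum(y) / len(y), 1e-6), 1 - 1e-6)
    f0 = math.log(p / (1 - p))  # initial score: log-odds of class 1
    scores = [f0] * len(X)
    stumps = []
    k = min(len(X), max(2, int(subsample * len(X))))
    for _ in range(n_trees):
        # Step 1: draw a random sample (without replacement) and fit the next
        # tree to the current errors (negative gradient of the logistic loss).
        idx = rng.sample(range(len(X)), k)
        residuals = [y[i] - sigmoid(scores[i]) for i in idx]
        stump = fit_stump([X[i] for i in idx], residuals)
        stumps.append(stump)
        # Step 2: add the shrunken tree to the running combined model.
        scores = [s + learning_rate * stump_predict(stump, row)
                  for s, row in zip(scores, X)]
    return f0, learning_rate, stumps


def predict(model, row):
    """Classify one sample from the summed output of all trees."""
    f0, learning_rate, stumps = model
    score = f0 + learning_rate * sum(stump_predict(s, row) for s in stumps)
    return 1 if score > 0 else 0
```

Calling `fit_sgb(X, y)` with `X` as a list of per-pixel feature vectors (for example, band values) and `y` as 0/1 class labels returns a model that `predict(model, row)` applies to a single pixel. A multi-class land cover problem like those in the study would need one such score per class, which this sketch omits.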
