Land cover classification using random forest with genetic algorithm-based parameter optimization

Abstract. Land cover classification based on remote sensing imagery is an important means of monitoring, evaluating, and managing land resources, but it requires robust classification methods that can accurately map complex land cover categories. Random forest (RF) is a powerful machine-learning classifier that is well suited to land remote sensing. However, two key RF parameters, namely the number of trees and the number of variables tried at each split, affect classification accuracy, so parameter selection is an unavoidable problem in RF-based image classification. This study uses a genetic algorithm (GA) to optimize these two parameters and thereby maximize land cover classification accuracy. HJ-1B CCD2 image data are used to classify six land cover categories in Changping, Beijing, China. Experimental results show that GA-RF avoids arbitrariness in parameter selection. The experiments also compare the land cover classification results of the GA-RF method, the traditional RF method (with default parameters), and the support vector machine method. Relative to the traditional RF and support vector machine methods, the GA-RF method improved classification accuracy by 1.02% and 6.64%, respectively. The comparison shows that GA-RF is a feasible solution for land cover classification that neither compromises accuracy nor incurs excessive computation time.
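
The parameter search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the GA settings, so the search bounds, population size, operators (tournament selection, uniform crossover, bounded mutation, elitism), and every function name below are assumptions chosen for illustration. The fitness function is a toy stand-in; in practice it would presumably be the cross-validated accuracy of an RF trained with the candidate number of trees and number of variables per split.

```python
import random

# Hypothetical search bounds; the abstract does not state the ranges used.
N_FEATURES = 4                 # assumed number of input features
N_TREES_RANGE = (10, 500)      # candidate values for the number of trees
MTRY_RANGE = (1, N_FEATURES)   # candidate values for variables tried per split


def clip(value, bounds):
    """Keep a gene inside its allowed range."""
    low, high = bounds
    return max(low, min(high, value))


def random_individual(rng):
    """An individual is a (n_trees, mtry) pair."""
    return (rng.randint(*N_TREES_RANGE), rng.randint(*MTRY_RANGE))


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return tuple(rng.choice(pair) for pair in zip(parent_a, parent_b))


def mutate(individual, rng, rate=0.2):
    """Perturb each gene with probability `rate`, staying within bounds."""
    n_trees, mtry = individual
    if rng.random() < rate:
        n_trees = clip(n_trees + rng.randint(-50, 50), N_TREES_RANGE)
    if rng.random() < rate:
        mtry = clip(mtry + rng.choice((-1, 1)), MTRY_RANGE)
    return (n_trees, mtry)


def tournament(population, scores, rng, k=3):
    """Pick the fittest of k randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def ga_search(fitness, pop_size=20, generations=15, seed=0):
    """Return the best (n_trees, mtry) pair found and its fitness."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        for ind, score in zip(population, scores):
            if score > best_score:
                best, best_score = ind, score
        # Elitism: the best individual so far survives unchanged.
        next_population = [best]
        while len(next_population) < pop_size:
            child = crossover(
                tournament(population, scores, rng),
                tournament(population, scores, rng),
                rng,
            )
            next_population.append(mutate(child, rng))
        population = next_population
    return best, best_score


def toy_fitness(individual):
    """Placeholder for classification accuracy (assumption, not real data)."""
    n_trees, mtry = individual
    return -((n_trees - 200) / 100) ** 2 - (mtry - 2) ** 2


if __name__ == "__main__":
    params, score = ga_search(toy_fitness)
    print("best (n_trees, mtry):", params, "fitness:", score)
```

To apply the sketch to real imagery, `toy_fitness` would be replaced by a function that trains an RF with the given pair and returns a validation accuracy; the rest of the loop is unchanged. Because each fitness call then trains a forest, the population size and generation count directly control the total run time.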
