A novel ensemble modeling approach for the spatial prediction of tropical forest fire susceptibility using LogitBoost machine learning classifier and multi-source geospatial data

A reliable forest fire susceptibility map is a necessity for disaster management and a primary reference source in land use planning. We set out to evaluate the LogitBoost ensemble-based decision tree (LEDT) machine learning method for forest fire susceptibility mapping through a comparative case study in the Lao Cai region of Vietnam. A thorough literature search indicates that the method has not previously been applied to forest fires. Support vector machine (SVM), random forest (RF), and kernel logistic regression (KLR) were used as benchmarks in the comparative evaluation. A fire inventory database for the study area was constructed from records of previous forest fire occurrences, and related conditioning factors were generated from a number of sources. Forest fire probability indices were then computed with each of the four modeling techniques, and their performance was compared using the area under the curve (AUC), Kappa index, overall accuracy, specificity, sensitivity, positive predictive value (PPV), and negative predictive value (NPV). The LEDT model produced the best performance on both the training and validation datasets, demonstrating a prediction capability of 92%. Its overall superiority over the benchmark models suggests that it has the potential to be used as an efficient new tool for forest fire susceptibility mapping. Fire prevention is a critical concern for local forestry authorities in the tropical Lao Cai region, and on the evidence of our study, the method has a potential application in forestry conservation management.
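
The evaluation measures named above can be illustrated with a minimal sketch. It uses the standard textbook definitions for a binary fire / non-fire classification; the function names and all numbers are hypothetical illustrations and are not taken from the study, whose own software and data are not described here.

```python
def binary_metrics(tp, fp, tn, fn):
    """Confusion-matrix measures for a binary fire / non-fire classifier.

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs).
    """
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),   # fire cells correctly flagged
        "specificity": tn / (tn + fp),   # non-fire cells correctly cleared
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "kappa": (accuracy - expected) / (1 - expected),
    }


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative one
    (ties count as one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical counts and scores, for illustration only.
    print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a small illustration; a sort-based computation would be the usual choice for a full validation set.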
