Machine Learning Methods and Synthetic Data Generation to Predict Large Wildfires

Wildfires are becoming more frequent in different parts of the globe, and predicting when and where they will occur is a complex task. Identifying wildfire events with a high probability of becoming large wildfires is important for supporting initial attack planning. Different methods, including physics-based, statistical, and machine learning (ML) approaches, are used in wildfire analysis. Among these, the ML-based methods are relatively novel. In addition, because the number of wildfires is much greater than the number of large wildfires, the dataset used to train an ML model is imbalanced, which can lead to overfitting or underfitting. In this manuscript, we propose combining synthetic data generated from variables of interest with ML models for the prediction of large wildfires. Specifically, five synthetic data generation methods are evaluated, and their results are analyzed with four ML methods. The results show an improvement in predictive power when synthetic data are used, offering a new method to be considered in Decision Support Systems (DSS) for wildfire management.
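The class-balancing idea described above can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not name the five generation methods, so this example assumes a SMOTE-style interpolation between minority samples as one plausible option. The function name `smote_like_oversample`, the toy variables (temperature and wind speed), and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).

    Illustrative sketch only; not the method evaluated in the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        neighbour = minority[rng.choice(others[:k])]
        t = rng.random()  # interpolation weight in [0, 1)
        # The new point lies on the segment between base and neighbour.
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: [temperature in C, wind speed in km/h] per fire event.
small_fires = [[20.0 + i % 7, 5.0 + i % 5] for i in range(40)]
large_fires = [[35.0, 30.0], [38.0, 25.0], [33.0, 35.0], [36.0, 28.0]]

# Generate enough synthetic large-fire samples to match the majority class.
new_large = smote_like_oversample(
    large_fires, n_new=len(small_fires) - len(large_fires)
)
balanced_large = large_fires + new_large
print(len(small_fires), len(balanced_large))
```

In a real workflow, synthetic samples would typically be generated from the training split only, so that the evaluation data contain only observed fires.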
