Modelling the Spatial Distribution of Asbestos-Cement Products in Poland with the Use of the Random Forest Algorithm

The unique set of physical and chemical properties of asbestos has led to many industrial applications worldwide; roofing and facades account for approximately 80% of the asbestos-containing products currently in use. Because asbestos-containing products are harmful to human health, their production and use have been banned in many countries. To date, no research has estimated the total amount of asbestos-cement products in use at the country level in relation to regions or other administrative units. This paper presents a possible new approach to mapping the spatial distribution of asbestos-cement products in use across a country by applying a supervised machine learning algorithm, Random Forest. The models were built from the results of a physical inventory of asbestos-cement products carried out with aerial imagery, combined with selected features describing the socio-economic situation of Poland: population; buildings; public finance; housing economy and municipal infrastructure; wages, salaries and social security benefits; agricultural census; entities of the national economy; labor market; environmental protection; area of built-up surfaces; historical affiliation with the annexations; and data on asbestos manufacturing plants. Important variables were selected in R v.3.1.0 with the support of the Boruta algorithm, and the amount of asbestos-cement products used in communes was predicted with the randomForest package. The best model, which explained 75.85% of the variance, was then used to prepare a prediction map of the spatial distribution of asbestos-cement products in use in Poland. The total amount was estimated at 710,278,645 m² (7.8 million tons). Since the best model used data on built-up surfaces, which are available for the whole of Europe, the method could be considered for use in other European countries, as well as for assessing the environmental risk of human exposure to asbestos.
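
Two of the figures quoted above can be made concrete with a short sketch. It is written in Python with the standard library only, not in the R environment the study used, and it is an illustration under stated assumptions rather than the authors' code. The first function assumes that "variance explained" follows the usual regression pseudo-R² form, 1 - MSE / Var(y), with the population variance of the observations. The second back-calculates the mass per square metre implied by the two reported totals. The function names and the toy observed and predicted values are hypothetical.

```python
from statistics import pvariance


def percent_variance_explained(observed, predicted):
    """Return 100 * (1 - MSE / Var(y)) for paired observations and predictions.

    Assumption: Var(y) is the population variance of the observed values.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return 100.0 * (1.0 - mse / pvariance(observed))


def implied_mass_per_area(total_mass_tons, total_area_m2):
    """Return the mass per unit area (kg/m^2) implied by two reported totals."""
    return total_mass_tons * 1000.0 / total_area_m2


# Hypothetical toy values, not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
toy_score = percent_variance_explained(observed, predicted)

# Totals as reported in the abstract: 7.8 million tons over 710,278,645 m^2.
kg_per_m2 = implied_mass_per_area(7.8e6, 710_278_645)

print(f"toy variance explained: {toy_score:.1f}%")
print(f"implied conversion factor: {kg_per_m2:.2f} kg/m^2")
```

The second result suggests that the reported tonnage corresponds to a conversion factor of roughly 11 kg per square metre of asbestos-cement product; the rounding of "7.8 million tons" limits how precisely that factor can be recovered.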
