A comparison of random forest and support vector machine approaches to predict coal spontaneous combustion in gob

Abstract: Accurate prediction of coal temperature plays a vital role in preventing and controlling the spontaneous combustion of coal in coal mines. In this study, a long-term in-situ observation experiment was conducted in a fully mechanized caving face of the Dafosi Coal Mine, yielding in-situ gas and temperature data. Two machine learning approaches, random forest (RF) and support vector machine (SVM), were introduced and compared for predicting coal spontaneous combustion from the in-situ monitoring data. Particle swarm optimization (PSO) was employed to tune the RF and SVM models by searching for their optimal hyper-parameters. Principal component analysis (PCA) was used to transform the original input data into a new dataset of uncorrelated variables, reducing the dimension of the input variables. The results indicated that, with or without PCA, the RF model was more robust than the SVM model and less affected by its own parameters, whereas the SVM model was highly sensitive to its parameters. Although PSO could find the optimal hyper-parameters of the RF model, the RF model with default parameters also predicted coal spontaneous combustion accurately and generalized satisfactorily. In contrast, the predictive performance of the SVM model improved dramatically after PSO optimization. The models with PCA showed the same characteristics. These results suggest that both the RF and SVM methods can be used to predict coal spontaneous combustion. Because the RF method can obtain accurate predictions without special parameter settings, it is more suitable for practical applications and can potentially be further employed as a reliable method for determining complicated relationships.
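To make the hyper-parameter search concrete, here is a minimal sketch of a basic particle swarm optimizer in plain Python. It is not the authors' implementation: the abstract does not give the swarm size, coefficients, or search ranges, so every value below is an assumption. The function `toy_validation_error` is a hypothetical stand-in for the validation error of an SVM as a function of two log-scaled hyper-parameters; in practice it would be replaced by a cross-validated error computed on the monitoring data.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimize objective(position) inside box bounds with a basic PSO.

    All default settings are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds; zero initial velocities.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and score so far.
    personal_best = [p[:] for p in positions]
    personal_score = [objective(p) for p in positions]
    # The swarm shares the best position found by any particle.
    best_idx = min(range(n_particles), key=lambda i: personal_score[i])
    global_best = personal_best[best_idx][:]
    global_score = personal_score[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + c_social * r2 * (global_best[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)
            score = objective(positions[i])
            if score < personal_score[i]:
                personal_best[i] = positions[i][:]
                personal_score[i] = score
                if score < global_score:
                    global_best = positions[i][:]
                    global_score = score
    return global_best, global_score


def toy_validation_error(params):
    """Hypothetical error surface with its minimum at log_c = 2, log_gamma = -1."""
    log_c, log_gamma = params
    return (log_c - 2.0) ** 2 + (log_gamma + 1.0) ** 2


best_params, best_error = pso_minimize(
    toy_validation_error, bounds=[(-3.0, 5.0), (-5.0, 3.0)]
)
print(best_params, best_error)
```

Searching in log-scaled coordinates is a common choice for SVM parameters whose useful values span several orders of magnitude. The same routine could be pointed at RF settings such as the number of trees, although the abstract reports that RF was accurate even with default parameters.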
