Abstract In petroleum exploration and engineering, classifying underground formation lithology from well log data is an important task, because lithology classification is the basis of reservoir parameter calculations and geological research. With the growing availability of inexpensive computing power, large amounts of data can now be analyzed with much higher accuracy and efficiency, which has prompted increased efforts to automate lithology classification. In one such effort, Xie et al. (2018) evaluated five machine learning methods for classifying formation lithology using data from the Daniudui gas field (DGF) and the Hangjinqi gas field (HGF). Although their tree-based ensemble models performed well, there is still scope for improving predictive ability. Motivated by the encouraging results they obtained with the boosting approach to machine learning, we applied this approach with the aim of improving upon the predictive ability of their ensemble models. Specifically, we applied the AdaBoost and LogitBoost meta-algorithms using decision stumps and random trees as base learners. We evaluated the boosted trees using precision, recall, F1-score and PRC area under 5-fold and 10-fold stratified cross-validation. Among the meta-algorithms applied, LogitBoost with a random tree as the base learner achieved the highest performance metrics.
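To make the boosting idea concrete, the following is a minimal sketch of discrete AdaBoost with decision stumps as base learners. It is not the implementation used in the study: it assumes a binary problem with labels +1 and -1 (lithology classification is multi-class), it runs on toy data, and the function names `stump_predict`, `fit_stump`, `adaboost` and `predict` are hypothetical names chosen for illustration.

```python
import math

def stump_predict(row, j, thr, pol):
    # A decision stump: one threshold test on feature j.
    return pol if row[j] <= thr else -pol

def fit_stump(X, y, w):
    # Exhaustively search for the stump with the lowest weighted error.
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, rounds=10):
    # Discrete AdaBoost for labels in {+1, -1} (an assumption of this sketch).
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, thr, pol = fit_stump(X, y, w)
        # Clip the error so the logarithm below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, pol))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, thr, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    # Weighted vote of all stumps.
    score = sum(a * stump_predict(row, j, thr, pol) for a, j, thr, pol in model)
    return 1 if score >= 0 else -1

# Toy data: no single threshold separates the classes.
X_toy = [[0], [1], [2], [3], [4], [5]]
y_toy = [1, 1, -1, -1, 1, 1]
model_toy = adaboost(X_toy, y_toy, rounds=3)
print([predict(model_toy, row) for row in X_toy])
```

On this toy set, any single stump misclassifies at least two of the six points, while the weighted vote of three stumps fits all six. That is the sense in which boosting combines weak base learners into a stronger one.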
[1] J. Friedman. Special Invited Paper-Additive logistic regression: A statistical view of boosting, 2000.
[2] Igor I. Baskin, et al. Random Subspaces and Random Forest, 2017.
[3] Mohammad Ali Sebtosheikh, et al. Support vector machine method, a new technique for lithology prediction in an Iranian heterogeneous carbonate reservoir using petrophysical well logs, 2015, Carbonates and Evaporites.
[4] Ian H. Witten, et al. The WEKA data mining software: an update, 2009, SKDD.
[5] Mario R. Eden, et al. Comparison of Tree Based Ensemble Machine Learning Methods for Prediction of Rate Constant of Diels-Alder Reaction, 2017.
[6] Takaya Saito, et al. The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets, 2015, PloS one.
[7] Lior Rokach, et al. Decision forest: Twenty years of research, 2016, Inf. Fusion.
[8] Nikunj C. Oza, et al. Online Ensemble Learning, 2000, AAAI/IAAI.
[9] Wen Zhou, et al. Evaluation of machine learning methods for formation lithology identification: A comparison of tuning processes and model performances, 2018.