Evaluating the Boosting Approach to Machine Learning for Formation Lithology Classification

Abstract. In petroleum exploration and engineering, classifying underground formation lithology from well log data is an important task, because lithology classification underpins reservoir parameter calculations and geological research. As inexpensive computing power has grown, large amounts of data can now be analyzed with much higher accuracy and efficiency, and efforts to automate lithology classification have increased accordingly. In one such effort, Xie et al. (2018) evaluated five machine learning methods for classifying formation lithology using data from the Daniudui gas field (DGF) and the Hangjinqi gas field (HGF). Although their tree-based ensemble models performed well, there is still scope for improving predictive ability. Motivated by the encouraging results they obtained with the boosting approach to machine learning, we applied this approach with the aim of improving upon the predictive ability of their ensemble models. Specifically, we applied the AdaBoost and LogitBoost meta-algorithms using decision stumps and random trees as base learners. We evaluated the boosted trees by calculating metrics such as precision, recall, F1-score and the area under the precision-recall curve (PRC area) under 5-fold and 10-fold stratified cross-validation. Among the meta-algorithms applied, LogitBoost with a random tree as the base learner achieved the highest performance metrics.
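
To make the boosting idea concrete, the following is a minimal sketch of discrete AdaBoost with decision stumps as base learners, written in plain Python. It is an illustration only and not the implementation used in this work: it assumes a two-class problem with labels +1 and -1 (lithology classification is multi-class), it runs on a hypothetical one-feature toy data set rather than well log data, and the function names (stump_predict, fit_stump, adaboost, predict) are invented for this example. LogitBoost and random-tree base learners are not shown.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Decision stump: predict `polarity` below the threshold, `-polarity` otherwise."""
    return polarity if row[feature] < threshold else -polarity


def fit_stump(X, y, w):
    """Search all features, midpoints and polarities for the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        values = sorted({row[feature] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
        for threshold in thresholds:
            for polarity in (1, -1):
                err = sum(
                    wi
                    for row, yi, wi in zip(X, y, w)
                    if stump_predict(row, feature, threshold, polarity) != yi
                )
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds):
    """Discrete AdaBoost for labels in {+1, -1}; returns a list of weighted stumps."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        err, feature, threshold, polarity = fit_stump(X, y, w)
        if err >= 0.5:  # base learner is no better than chance: stop early
            break
        err = max(err, 1e-12)  # guard against division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the others.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(
        alpha * stump_predict(row, feature, threshold, polarity)
        for alpha, feature, threshold, polarity in ensemble
    )
    return 1 if score >= 0 else -1


# Hypothetical toy data: one feature, labels that no single stump can separate.
X = [[float(i)] for i in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost(X, y, rounds=3)
predictions = [predict(ensemble, row) for row in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
```

On this toy set no single stump separates the two classes, but the weighted vote of three stumps classifies every training point correctly. This illustrates how boosting combines weak base learners by reweighting the samples that earlier learners misclassified.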