Improving uncertainty analysis in well log classification by machine learning with a scaling algorithm

Abstract: Uncertainty is an important indicator of the confidence that can be placed in a prediction. However, most machine learning classifiers cannot predict accurate probabilities, that is, probabilities that match the observed frequency of each class. To overcome this problem and improve the uncertainty analysis in lithofacies classification, a novel calibration process is performed on a validation subset using scaling algorithms. The validation subset is kept separate from the training data in order to reduce overfitting and avoid unwanted biases. Random forest is selected as the machine learning classifier owing to its relative ease of use and its ability to handle high-dimensional input features. The proposed approach is applied to two real datasets to demonstrate the improvement in uncertainty quantification. Reliability diagrams show that the calibrated probabilities lie closer to the diagonal, which represents perfect calibration, than the uncalibrated ones. In addition to this visual check, the multiclass Brier score, which can be treated as a loss function, is lower for the calibrated model than for the uncalibrated one. On the testing subset, the calibrated classifier also yields a higher Matthews correlation coefficient than the uncalibrated one, indicating better classification performance.
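The workflow summarized above (fit a scaling on a held-out validation subset, then compare the Brier score and the Matthews correlation coefficient) can be illustrated with a minimal sketch. The abstract does not say which scaling algorithm is used, so a one-parameter temperature scaling chosen by grid search is an assumption made here for illustration only. The function names, the toy probabilities, and the convention of summing the Brier score over classes are likewise hypothetical and are not taken from the paper.

```python
import math


def brier_score(probs, labels):
    """Multiclass Brier score: mean squared distance between each predicted
    probability vector and the one-hot vector of the true class
    (summed over classes, averaged over samples)."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2
                     for k, pk in enumerate(p))
    return total / len(labels)


def temperature_scale(probs, T, eps=1e-12):
    """Illustrative scaling: raise each probability to 1/T and renormalise.
    T > 1 softens overconfident outputs, T < 1 sharpens them."""
    scaled = []
    for p in probs:
        w = [max(pk, eps) ** (1.0 / T) for pk in p]
        s = sum(w)
        scaled.append([wk / s for wk in w])
    return scaled


def fit_temperature(val_probs, val_labels, grid=None):
    """Choose the T that minimises the Brier score on the validation subset.
    The default grid (0.25 to 5.0) includes T = 1.0, i.e. no change."""
    grid = grid or [0.25 * i for i in range(1, 21)]
    return min(grid, key=lambda T: brier_score(
        temperature_scale(val_probs, T), val_labels))


def matthews_corrcoef(y_true, y_pred):
    """Multiclass Matthews correlation coefficient computed from counts."""
    classes = sorted(set(y_true) | set(y_pred))
    s = len(y_true)                                   # number of samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct samples
    t_k = [sum(1 for t in y_true if t == k) for k in classes]
    p_k = [sum(1 for p in y_pred if p == k) for k in classes]
    num = c * s - sum(a * b for a, b in zip(p_k, t_k))
    den = math.sqrt((s * s - sum(a * a for a in p_k))
                    * (s * s - sum(b * b for b in t_k)))
    return num / den if den > 0 else 0.0


# Toy validation subset: an overconfident three-class classifier.
val_probs = [[0.90, 0.05, 0.05], [0.90, 0.05, 0.05],
             [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]]
val_labels = [0, 1, 1, 2]

T = fit_temperature(val_probs, val_labels)
calibrated = temperature_scale(val_probs, T)
val_pred = [max(range(len(p)), key=p.__getitem__) for p in calibrated]

print("chosen T:", T)
print("Brier before:", brier_score(val_probs, val_labels))
print("Brier after: ", brier_score(calibrated, val_labels))
print("MCC:", matthews_corrcoef(val_labels, val_pred))
```

In this sketch the raw probabilities would come from the trained classifier (for example, a random forest) evaluated on the validation subset, and the fitted T would then be applied unchanged to the testing subset. A single temperature preserves the ranking of classes within each sample, so it cannot change the predicted labels or the Matthews correlation coefficient; richer scalings such as per-class sigmoid fits or isotonic regression can.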
