Ensemble-based big data analytics of lithofacies for automatic development of petroleum reservoirs

Abstract: This paper explores big data-driven ensemble learning for quantitative geological lithofacies modeling, an integral and challenging part of petroleum reservoir development and characterization. Quantitative lithofacies modeling involves detecting and recognizing the lithofacies of subsurface rock. It requires real-time acquisition, handling, storage, conditioning, analysis, and interpretation of raw sensory petroleum logging data. The real-time well-log data collected from sensor-based tools suffer from complications such as noise, nonlinearity, imbalance, and high dimensionality, which make the prediction task more challenging. The existing literature on quantitative lithofacies modeling includes several data-driven techniques, ranging from conventional well-log analysis to artificial intelligence (AI). Recently, ensemble learners built from multiple classifiers have been found to be more robust and reliable for detection and identification tasks in various machine learning applications; however, they have not been widely adopted in the petroleum industry. An ensemble combines the opinions of diverse experts into an overall decision, which in turn reduces the risk of a wrong decision. Thus, the uncertainties associated with complex reservoir data can be handled better by ensemble learners than by conventional models based on a single learner. The ensemble-based big data analytics proposed in the paper comprises the development and comparative performance testing of five popular ensemble methods (Bagging, AdaBoost, Rotation Forest, Random Subspace, and DECORATE) for quantitative lithofacies modeling. Seven state-of-the-art base classifiers were used as members of the different ensemble learners to analyze oil-field data from Kansas (U.S.A.). The proposed techniques were implemented on the widely used WEKA platform. The comparative performance analysis presented in the paper confirms the superiority of the proposed techniques over existing techniques for quantitative lithofacies modeling.
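
The claim that combining several members' opinions reduces the risk of a wrong decision can be illustrated with a minimal Python sketch. It is not the paper's WEKA implementation: the facies labels, the 0.7 per-member accuracy, and the function names are hypothetical, and the accuracy calculation assumes a two-class problem with an odd number of members whose errors are independent, which real base classifiers only approximate.

```python
from collections import Counter
from math import comb


def majority_vote(predictions):
    """Return the label chosen by most ensemble members (ties go to the first label seen)."""
    return Counter(predictions).most_common(1)[0][0]


def majority_accuracy(p, n):
    """Probability that a majority of n members is correct.

    Assumes two classes, odd n, and independent members that are
    each correct with probability p (an idealization).
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


# Hypothetical facies labels predicted by five members for one depth sample.
votes = ["sandstone", "shale", "sandstone", "sandstone", "siltstone"]
ensemble_label = majority_vote(votes)

# Hypothetical accuracy of a single member versus a five-member majority.
single = 0.7
ensemble = majority_accuracy(single, 5)

print(ensemble_label)
print(f"single member: {single:.3f}, five-member majority: {ensemble:.3f}")
```

Under these idealized assumptions the majority is correct more often than any single member. The gain shrinks as the members' errors become correlated, which is why ensemble methods such as those compared in the paper aim for diverse members.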
