Optimization of Models for Rapid Identification of Oil and Water Layers During Drilling - A Win-Win Strategy Based on Machine Learning

The identification of oil and water layers (OWL) from well log data is an important task in petroleum exploration and engineering. The methods commonly used for OWL identification at present are time-consuming, have low accuracy, or depend heavily on the experience of the researcher. Machine learning methods have therefore been developed to identify lithology and OWL, and several computational algorithms have recently been applied to improve prediction accuracy. Based on logging while drilling (LWD) data, this paper optimizes machine learning methods to identify OWL while drilling. We evaluate three popular machine learning methods: the one-against-rest support vector machine, the one-against-one support vector machine, and random forest. First, we choose appropriate training set data as a sample for model training. Then, the GridSearch method is used to find the approximate range of reasonable parameter values. Next, k-fold cross validation is used to optimize the final parameters and to avoid overfitting. Finally, appropriate test set data are chosen to verify the model. This machine learning approach to identifying OWL while drilling has been successfully applied in the Weibei oilfield. We select 1934 groups of well logging response data from 31 production wells. Among them, 198 groups of LWD data were selected as the test set. Natural gamma ray, shale content, acoustic transit time, and deep-sensing logs were selected as input feature parameters. After GridSearch and 10-fold cross validation, the results suggest that random forest is the best algorithm for supervised classification of OWL using well log data. On the training set, all three classifiers achieve an accuracy greater than 90%, but the differences among them are relatively large. On the test set, the accuracy of the three classifiers is about 90%, with only small differences.
The one-against-rest support vector machine classifier takes much more time than the other methods. The one-against-one support vector machine classifier has the lowest training set and test set accuracy of the three methods. Although the methods differ in OWL identification accuracy, the accuracy of all three is relatively high. For different reservoirs, taking into account both the time cost and the model accuracy, the random forest and one-against-one support vector machine models can be used to identify OWL in real time during drilling.
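
The tuning workflow described above (a grid search over candidate parameters, with each candidate scored by k-fold cross validation) can be illustrated with a minimal, standard-library-only sketch. It is not the implementation used in this work. To stay self-contained it swaps in a simple k-nearest-neighbour classifier in place of the support vector machine and random forest models, and it runs on synthetic four-feature data rather than real well logs. All names (`make_toy_data`, `k_fold_indices`, `knn_predict`, `cv_accuracy`, `grid_search`) and the parameter grid are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter
from itertools import product


def make_toy_data(n_per_class=40, seed=0):
    """Synthetic 4-feature samples for 3 made-up classes (NOT real log data)."""
    rng = random.Random(seed)
    centers = {0: (1.0, 1.0, 1.0, 1.0), 1: (3.0, 3.0, 1.0, 2.0), 2: (5.0, 1.0, 3.0, 3.0)}
    X, y = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            X.append([c + rng.gauss(0.0, 0.8) for c in center])
            y.append(label)
    return X, y


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and split them into n_folds disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def knn_predict(X_train, y_train, x, k, p):
    """Majority vote among the k nearest training points.

    Distances are Minkowski-p sums without the final root, which
    preserves the neighbour ordering.
    """
    dists = sorted(
        (sum(abs(a - b) ** p for a, b in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cv_accuracy(X, y, folds, k, p):
    """Mean held-out accuracy over all folds for one (k, p) candidate."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        X_tr = [X[i] for i in range(len(X)) if i not in held]
        y_tr = [y[i] for i in range(len(X)) if i not in held]
        correct = sum(knn_predict(X_tr, y_tr, X[i], k, p) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def grid_search(X, y, grid, n_folds=10):
    """Score every parameter combination by k-fold CV and return the best one."""
    folds = k_fold_indices(len(X), n_folds)
    results = {}
    for k, p in product(grid["k"], grid["p"]):
        results[(k, p)] = cv_accuracy(X, y, folds, k, p)
    best = max(results, key=results.get)
    return best, results


if __name__ == "__main__":
    X, y = make_toy_data()
    best, results = grid_search(X, y, {"k": [1, 3, 5], "p": [1, 2]})
    print("best (k, p):", best, "mean CV accuracy:", round(results[best], 3))
```

The same folds are reused for every candidate so that the scores are comparable, and a separate test set should still be held back for the final check, as in the workflow above. In practice a library such as scikit-learn provides this pattern through `GridSearchCV`, together with `OneVsRestClassifier`, `OneVsOneClassifier`, and `RandomForestClassifier`. Whether such a library was used in this work is not stated and is only an assumption here.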