Natural Language Processing Techniques on Oil and Gas Drilling Data
Recent advances in search, machine learning, and natural language processing have made it possible to extract structured information from free text, providing a new and largely untapped source of insights for well and reservoir planning. However, applying these techniques to data that is messy or lacks a labeled training set poses major challenges; we cover some of the ways in which these problems can be overcome. We present a method to compare the distribution of hypothesized and realized risks to oil wells described in two datasets that contain free-text descriptions of risks. We treat one dataset as a training set for a logistic regression classifier, and then use this classifier to label the events in the other, out-of-domain dataset. To adjust for differences between the datasets, we rebalance the training set and supplement it with labeled instances automatically extracted from the test set. These simple domain adaptation techniques allow us to achieve an average F1 score of 0.84 on the out-of-domain test set.

Introduction

In the oil and gas industry, risk identification and assessment is a critical business practice. This holds particularly true during the drilling stages, which cannot begin before a risk assessment is conducted to understand what risks are possible. While these risk assessments are typically conducted in a group setting (in an aptly titled Risk Assessment Meeting), the project drilling engineer will usually have a predetermined list of risks and likelihood scores that are the focus of the conversation. One problem with this approach is that the drilling engineer is inherently biased by personal experience, which can affect their view of how likely an event is to happen.
For example, if the project drilling engineer recently encountered well control issues, they will likely overestimate the chance of future well control issues; on the other hand, if they have never encountered a well control issue, it may be unintentionally omitted from their risk assessments altogether. Both scenarios pose problems, and the latter may become even more prevalent during the Big Crew Change, since newer drilling engineers could lack both the experience to assess the full array of risks and the mentors and guidance to correct their oversights. Using historical data as a barometer could help the drilling engineer overcome these issues, though doing so requires a unified view of both prior risk assessments and prior issues encountered. Chevron possesses both pieces of data, though in disparate systems:

• The Risk Assessment (A) database contains descriptions of risks from historical risk assessments.
• The Well Operations Database (B) contains descriptions of unexpected events and associated unexpected event (UE) codes, which categorize the unexpected events.

SPE-181015-MS

Leveraging both, we create a system that lets a project drilling engineer enter a risk in natural language, and that then returns drilling codes related to this risk, produces statistics showing how often these types of events have happened in the past, and predicts the likelihood of the problem occurring in certain fields.

Conductor not set in line or plumb w/ other wells on pad.
Excessive water flow from formations.
Casing becomes stuck.
Lose lateral length from laying down casing joints.
Stripper Rubber leaking causing uncontrolled gas.
Table 1: Risk Assessment (A) examples.
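
Two of the steps mentioned in the abstract, rebalancing the training set and reporting an average F1 score, can be illustrated with a short sketch. This is not the authors' implementation: the text does not say how the rebalancing was done or which kind of averaging was used, so random oversampling and an unweighted (macro) mean are assumptions here, and the labels `STUCK_PIPE` and `WELL_CONTROL` are hypothetical stand-ins for UE codes.

```python
import random
from collections import Counter, defaultdict


def rebalance(examples, seed=0):
    """Oversample minority classes until each matches the largest class.

    `examples` is a list of (text, label) pairs. Random oversampling is an
    assumption; the source only states that the training set was rebalanced.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        # Draw extra copies with replacement to reach the target size.
        balanced.extend(rng.choice(items) for _ in range(target - len(items)))
    rng.shuffle(balanced)
    return balanced


def average_f1(gold, predicted):
    """Unweighted mean of per-class F1 over the classes present in `gold`.

    Macro averaging is an assumption; the source only says "average F1".
    """
    scores = []
    for cls in sorted(set(gold)):
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == cls and p == cls)
        fp = sum(1 for g, p in pairs if g != cls and p == cls)
        fn = sum(1 for g, p in pairs if g == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labeled examples; the label names are illustrative only.
train = [
    ("Casing becomes stuck.", "STUCK_PIPE"),
    ("Pipe stuck while tripping out.", "STUCK_PIPE"),
    ("Stuck in hole after connection.", "STUCK_PIPE"),
    ("Stripper Rubber leaking causing uncontrolled gas.", "WELL_CONTROL"),
]
balanced_train = rebalance(train)
print(Counter(label for _, label in balanced_train))
```

In the setup the abstract describes, a classifier would be fit on the rebalanced examples and `average_f1` would then be computed on its predictions for the out-of-domain dataset.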