A novel data assimilation methodology for predicting lithology based on sequence labeling algorithms

A hidden Markov model (HMM) and a conditional random fields (CRFs) model for lithological prediction from multiple geophysical well-logging data are derived to handle directional nonstationarity through bidirectional training and conditioning. The developed models were benchmarked against their conventional counterparts using hypothetical boreholes and the corresponding synthetic geophysical data, which include artificial errors. Across the three test scenarios devised, the average fitness and unfitness values are 0.84 and 0.071 for the developed CRFs model and 0.81 and 0.084 for the developed HMM, compared with 0.78 and 0.091 for the conventional CRFs model and 0.77 and 0.099 for the conventional HMM. These comparisons show that the models designed for directional nonstationarity predict lithology better than the conventional models in all tested examples. Among them, the developed linear-chain CRFs model showed the best or close to the best performance, combining high predictability with a low training data requirement.
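
The idea of directional nonstationarity can be illustrated with a minimal sketch. It is not the authors' implementation; it only assumes that lithology labels along a borehole are integer codes and that a first-order transition matrix is estimated by counting adjacent label pairs. The function name `transition_matrix`, the label codes, and the example sequence are all hypothetical. Estimating the matrix once in the downward direction and once in the upward direction generally gives two different matrices, which is the kind of asymmetry a bidirectionally trained model could exploit.

```python
import numpy as np


def transition_matrix(labels, n_states):
    """Row-normalised first-order transition frequencies of a label sequence."""
    counts = np.zeros((n_states, n_states))
    # Count every adjacent pair (current label, next label).
    for a, b in zip(labels[:-1], labels[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Leave rows with no observed transitions as zeros instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Hypothetical lithology codes: 0 = sand, 1 = shale, 2 = limestone.
labels = [0, 0, 1, 1, 1, 2, 0, 0, 1, 2, 2, 0]

downward = transition_matrix(labels, 3)        # read from top to bottom
upward = transition_matrix(labels[::-1], 3)    # read from bottom to top

print("downward:\n", downward)
print("upward:\n", upward)
```

In this toy sequence, sand is followed by shale half of the time when read downward but never when read upward, so a single direction-agnostic transition matrix would discard information.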
