A semi-supervised clustering-based approach for stratification identification using borehole and cone penetration test data

Abstract: Borehole drilling and the cone penetration test (CPT) are frequently employed site investigation methods for identifying subsurface stratification. However, each method has its own strengths and limitations, and the two rely on different soil classification protocols. An approach that can jointly interpret raw data from both methods and provide unified soil classification results is therefore in great demand. This paper presents a novel semi-supervised, clustering-based stratification identification approach that uses information from both borehole and CPT logs. The approach is built on a hidden Markov random field (HMRF) framework, so that borehole data can be introduced as supervision constraints during the clustering of CPT sounding samples. It employs a Monte Carlo Expectation Maximization (MCEM) algorithm to perform the clustering, which enables the subsurface stratification to be estimated in a probabilistic manner. The performance of the proposed approach is evaluated using real-world site investigation data. The test results indicate that the approach is effective and robust for identifying subsurface stratification.
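To make the idea of clustering CPT samples under borehole supervision more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single CPT feature along one sounding, Gaussian emissions per soil class, a Potts prior with a fixed smoothness weight `beta`, and borehole observations supplied as hard labels at a few sample indices (the paper's supervision constraints may take a different form). The function name `mcem_hmrf_1d` and all of its parameters are hypothetical. The E-step is approximated by Gibbs sampling of the hidden labels, and the M-step re-estimates the class means and variances from the sampled label frequencies.

```python
import numpy as np

def mcem_hmrf_1d(x, n_classes, known, beta=1.0, n_iter=20, n_sweeps=10, seed=0):
    """Cluster a 1D profile x with a Potts-prior HMRF fitted by Monte Carlo EM.

    known: dict {sample index: class label}, a stand-in for borehole
    observations (assumed hard constraints in this sketch).
    Returns (label probabilities of shape (n, n_classes), means, variances).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size

    # Initial class means: quantile midpoints, overridden by the mean of the
    # borehole-labelled samples wherever a class has any.
    mu = np.quantile(x, (np.arange(n_classes) + 0.5) / n_classes)
    for k in range(n_classes):
        idx = [i for i, lab in known.items() if lab == k]
        if idx:
            mu[k] = x[idx].mean()
    var = np.full(n_classes, x.var() + 1e-6)

    # Initial labels: nearest class mean, with labelled samples clamped.
    z = np.abs(x[:, None] - mu[None, :]).argmin(axis=1)
    for i, lab in known.items():
        z[i] = lab

    post = np.zeros((n, n_classes))
    for _ in range(n_iter):
        # Monte Carlo E-step: Gibbs sweeps over the unlabelled samples.
        counts = np.zeros((n, n_classes))
        for _ in range(n_sweeps):
            for i in range(n):
                if i in known:
                    continue  # borehole-labelled samples stay fixed
                # Gaussian log-likelihood of x[i] under each class.
                logp = -0.5 * np.log(2 * np.pi * var) - 0.5 * (x[i] - mu) ** 2 / var
                # Potts prior: reward agreement with the depth neighbours.
                if i > 0:
                    logp[z[i - 1]] += beta
                if i < n - 1:
                    logp[z[i + 1]] += beta
                p = np.exp(logp - logp.max())
                p /= p.sum()
                z[i] = rng.choice(n_classes, p=p)
            counts[np.arange(n), z] += 1
        post = counts / n_sweeps

        # M-step: weighted updates of the Gaussian parameters.
        w = post.sum(axis=0) + 1e-12
        mu = (post * x[:, None]).sum(axis=0) / w
        var = (post * (x[:, None] - mu[None, :]) ** 2).sum(axis=0) / w + 1e-6
    return post, mu, var
```

Each row of the returned probability array gives the sampled frequency of every class at that depth, which is one simple way to express stratification uncertainty. A call such as `mcem_hmrf_1d(x, 2, {5: 0, 50: 1})` treats samples 5 and 50 as borehole-identified and clusters the rest.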
