Compressive sensing is used to improve the efficiency of seismic data acquisition and survey design. Nevertheless, most methods are ad hoc, and their only aim is to fill in the gaps in the data. Algorithms may be able to predict the values of missing receivers; however, it is also desirable to associate each prediction with a degree of uncertainty. We use beta process factor analysis (BPFA) and the variance of its predictions, achieving a high correlation between the uncertainty and the respective reconstruction error. Comparisons with other algorithms in the literature and results on synthetic and field data illustrate the advantages of using BPFA for uncertainty quantification. This could be useful when modeling the degree of uncertainty of different source/receiver configurations to guide future seismic survey design.

INTRODUCTION

Seismic data acquisition involves sampling the seismic wavefield at or near the earth's surface. A source at the surface creates a wavefield that is reflected and refracted by changes in impedance. Surface receivers record the reflected wavefield, generally on a regular grid. However, some of those receivers may be missing, either because of malfunction or because they could not be placed at the location required by the survey (e.g., because of a surface obstruction). To overcome this, signal reconstruction algorithms are used to replace or restore the output of the missing receivers (traces). Most modern algorithms use the principle of compressive sensing (CS), which assumes that the signal of interest is sparse (has few nonzero elements), either in its natural domain or in some other basis. In the seismic CS literature, sparsity is assumed in the Fourier (Sacchi et al., 1998), the Radon (Trad et al., 2002), the curvelet (Herrmann and Hennenfent, 2008), or the focal transform (Kutscha and Verschuur, 2016), to name a few.
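The sparsity assumption above can be made concrete with a minimal illustration (not from the paper, values chosen for the example only): a signal that is dense in the time domain can have only a handful of significant coefficients in the Fourier basis.

```python
import numpy as np

# Minimal illustration (not from the paper): a signal that is dense in time
# can be sparse in another basis, here the Fourier basis.
n = 256
t = np.arange(n)
signal = np.sin(2 * np.pi * 5 * t / n) + 0.5 * np.sin(2 * np.pi * 12 * t / n)

coeffs = np.fft.fft(signal) / n
significant = np.sum(np.abs(coeffs) > 1e-6)  # coefficients carrying energy

print("nonzero time samples:", np.sum(np.abs(signal) > 1e-6))
print("significant Fourier coefficients:", significant)  # 4: two conjugate pairs
```

Reconstruction algorithms exploit exactly this kind of compressibility: recovering four Fourier coefficients requires far fewer measurements than recovering 256 independent time samples.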
A popular method that uses the Fourier transform is projection onto convex sets (POCS) (Abma and Kabir, 2006), which transforms the available data and applies hard or soft thresholding (Stanton et al., 2015) to reconstruct the desired signals. Iteratively reweighted least squares have also been proposed in the Fourier domain (Zwartjes and Sacchi, 2007). Spectral projected gradient for L1 (SPGL1) (van den Berg and Friedlander, 2009) is another method, which uses a predefined dictionary of basis functions providing a sparse representation to solve the l1-norm minimization problem. Tensor completion algorithms (Kreimer and Sacchi, 2011, 2012) were also proposed to scale to larger dimensions. Other techniques based on prediction filters use the nonaliased low frequencies of seismic data to reconstruct the aliased parts (Spitz, 1991; Porsani, 1999; Naghizadeh and Sacchi, 2007). Alternative methods have been proposed that do not require a predefined dictionary of basis functions, such as those of Beckouche and Ma (2014), Zhu et al. (2015), and Turquais et al. (2015). These techniques train on the available seismic data and learn a dictionary that can be used for sparse representation. Recently, new approaches in seismic CS have been proposed that use principles from the Bayesian statistics and machine learning literature. The relevance vector machine (RVM) (Tipping, 2001; Tipping and Faul, 2003) has been applied to seismic interpolation with success, matching the performance of SPGL1 on time slices (Pilikos and Faul, 2016). In addition, Pilikos and Faul (2016) use an improved uncertainty measure and illustrate an uncertainty map for the predictions. Another method that has been used is beta process factor analysis (BPFA), which learns a dictionary of basis functions from the available seismic data for interpolation and denoising (Pilikos and Faul, 2017). It has been shown that BPFA outperforms SPGL1 and POCS when processing time slices.
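The POCS scheme described above alternates between enforcing Fourier-domain sparsity and honoring the observed traces. A hedged sketch of this idea follows; the threshold schedule and all names are illustrative assumptions, not the published implementation.

```python
import numpy as np

def pocs_interpolate(observed, mask, n_iter=100):
    """POCS-style reconstruction sketch (after Abma and Kabir, 2006).

    observed: data with zeros at missing samples; mask: 1 where observed.
    The linear threshold schedule is an illustrative choice.
    """
    x = observed.copy()
    for k in range(n_iter):
        spectrum = np.fft.fft2(x)
        # Hard threshold that relaxes linearly over the iterations.
        thresh = np.abs(spectrum).max() * (1.0 - (k + 1) / n_iter)
        spectrum[np.abs(spectrum) < thresh] = 0.0
        x = np.real(np.fft.ifft2(spectrum))
        # Projection onto the data-consistency set: keep the known samples.
        x = mask * observed + (1 - mask) * x
    return x

# Toy usage: a plane wave with roughly 30% of samples removed at random.
rng = np.random.default_rng(0)
i, j = np.meshgrid(np.arange(64), np.arange(64), indexing="ij")
true = np.sin(2 * np.pi * (3 * i + 5 * j) / 64)
mask = (rng.random((64, 64)) > 0.3).astype(float)
rec = pocs_interpolate(true * mask, mask)
```

Because a plane wave is maximally sparse in the f-k domain, even this simple schedule recovers the missing samples well; real seismic data require more careful thresholding.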
Furthermore, when the data are reordered into the shot record domain, there is no sign of aliasing in the frequency-wavenumber (f-k) domain (Pilikos et al., 2017).

Manuscript received by the Editor 27 February 2018; revised manuscript received 17 August 2018; published ahead of production 21 November 2018; published online 11 February 2019. University of Cambridge, Laboratory for Scientific Computing, Maxwell Centre, Department of Physics, J. J. Thomson Avenue, Cambridge CB3 0HE, UK. E-mail: ggp29@cam.ac.uk; acf22@cam.ac.uk. © 2019 Society of Exploration Geophysicists. All rights reserved. GEOPHYSICS, VOL. 84, NO. 2 (MARCH-APRIL 2019); P. P15-P25, 13 FIGS., 5 TABLES. 10.1190/GEO2018-0145.1

The core idea behind Bayesian machine learning is the construction of models using probability distributions over random variables. This provides flexibility in modeling because it is possible to incorporate prior knowledge and guide the model. Furthermore, it allows the model to provide uncertainty information about its predictions. Bayesian statistics has a long history in solving inverse problems in geophysics. Duijndam (1988a, 1988b) gives a comprehensive introduction to the field, and later Ulrych et al. (2001) write a tutorial with seismic applications. Malinverno and Briggs (2004) expand this using empirical Bayes for uncertainty quantification. Other applications of Bayesian estimation can be found in Wang et al. (2008) for seismic wavefield separation and in Fjeldstad and Grana (2018) for petrophysics-seismic inversion, to name a few. In this paper, we use Bayesian machine learning for seismic CS to create probabilistic data-driven models and, at the same time, uncertainty maps.
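The value of an uncertainty map lies in how well it tracks the actual reconstruction error. A hedged sketch with simulated data illustrates the principle: the per-pixel variance across posterior samples (e.g., drawn during Gibbs sampling) serves as the uncertainty map, and Spearman's rank correlation scores it against the error. All numbers below are synthetic assumptions, not results from the paper.

```python
import numpy as np
from scipy.stats import spearmanr

# Synthetic stand-in for posterior samples: pixels with a larger noise scale
# should end up with both higher variance and higher reconstruction error.
rng = np.random.default_rng(1)
truth = rng.normal(size=(32, 32))
scale = rng.uniform(0.01, 1.0, size=truth.shape)   # per-pixel noise level
samples = truth + rng.normal(size=(200,) + truth.shape) * scale

reconstruction = samples.mean(axis=0)   # posterior mean as the prediction
uncertainty = samples.var(axis=0)       # per-pixel uncertainty map
error = np.abs(reconstruction - truth)  # actual reconstruction error

rho, _ = spearmanr(uncertainty.ravel(), error.ravel())
print(f"Spearman correlation: {rho:.2f}")  # positive: uncertainty tracks error
```

A rank correlation is preferable to a linear one here because the uncertainty only needs to order the pixels correctly, not match the error in magnitude.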
To avoid confusion, these models do not refer to velocity models but rather to constructions with general assumptions that adapt to the available data. The first model that we use is the RVM. This model uses a sparsity-promoting prior distribution, in the form of a hyperprior over the coefficients of a linear combination of basis functions. By learning the appropriate parameters, the model can provide a predictive mean and a predictive variance, which can then be used for prediction and uncertainty quantification. Nevertheless, the predictive variance was found to behave counterintuitively when using basis functions with finite support (Rasmussen and Quiñonero Candela, 2005), providing small uncertainty when data points are far from the model and vice versa. To overcome the problematic predictive variance, Faul and Pilikos (2016) propose a new uncertainty measure for the RVM, which calculates the expected change in the likelihood caused by the predicted data point. Pilikos and Faul (2016) apply this measure to seismic data with some preliminary results. The second model that we use is BPFA (Zhou et al., 2012). This model takes a different approach to enforcing sparsity on the coefficients of the linear combination: a Bernoulli distribution controls whether each coefficient is zero or not, and the parameter of the Bernoulli distribution is itself governed by a beta distribution, allowing flexibility in the level of sparsity. The resulting binary variable is then multiplied element-wise with a draw from a normal distribution to produce the value of the coefficient. This way of modeling yields exactly zero coefficients, as opposed to the RVM. Another advantage is that BPFA also learns a dictionary of basis functions from the available data, providing another level of flexibility that makes its reconstructions more accurate.
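The beta-Bernoulli sparsity mechanism just described can be sketched generatively as follows; the hyperparameter values are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each coefficient is the element-wise product of a Bernoulli on/off
# variable and a Gaussian weight, so inactive coefficients are exactly zero.
K = 10                               # number of dictionary atoms (assumed)
a, b = 1.0, 9.0                      # beta hyperparameters favoring sparsity
pi = rng.beta(a, b, size=K)          # pi_k ~ Beta(a, b): atom usage probability
z = rng.binomial(1, pi)              # z_k ~ Bernoulli(pi_k): atom on or off
s = rng.normal(0.0, 1.0, size=K)     # s_k ~ Normal(0, 1): weight value
w = z * s                            # sparse coefficient vector

print("active atoms:", int(z.sum()), "of", K)
print("coefficients:", np.round(w, 2))  # exact zeros where z_k = 0
```

Note the contrast with the RVM: there, sparsity emerges when hyperparameter optimization drives coefficients toward zero, whereas here the binary variable z makes them exactly zero by construction.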
BPFA was compared with other algorithms in the literature and obtained state-of-the-art reconstructions without any signs of aliasing in the f-k domain (Pilikos and Faul, 2017; Pilikos et al., 2017). In this paper, we propose to use BPFA to create uncertainty maps. We calculate the variance of each prediction obtained by the inference process, which uses Gibbs sampling. Using this variance, we show that it is possible to obtain much better uncertainty maps for the reconstructed signals than others in the literature. The structure of the paper is as follows: first, an introduction to Bayesian machine learning is given, providing basic definitions and an explanation of the RVM. Various modifications to the predictive variance of the RVM are also discussed. In addition, BPFA is described, along with how the variance of its predictions can be used to create uncertainty maps for seismic CS. Experiments on sections of time slices are provided, along with representative uncertainty maps and reconstructions for all algorithms. Furthermore, uncertainty maps for shot records are provided. A thorough comparison and analysis of thousands of uncertainty maps obtained with different methods illustrates their performance in detail using Spearman's correlation coefficient. Stacking of uncertainty maps, which improves the correlation with the reconstruction error, is also presented. Finally, we include an example on field data, along with conclusions.

BAYESIAN MACHINE LEARNING

Using data-driven models to describe real-world observations has increased in popularity. Uncertainty is an integral part of both the model and the measurements, and models that are able to capture it are very desirable. Bayesian machine learning is a framework that tackles this by allowing the construction of models using probability distributions over random variables. Bayes' rule is defined by

p(Θ|K) = p(K|Θ) p(Θ) / p(K),    (1)

where Θ is the collection of all unknown variables and K are the available observations.
The term p(Θ) is the prior distribution of the variables, capturing our prior belief about how they are distributed; p(K|Θ) is the likelihood function, which gives the probability of the observations being generated by a particular configuration of Θ; and p(K) is the distribution of the observations, known as the evidence, which normalizes the posterior.
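A toy numeric illustration of equation 1 with a discrete unknown makes the roles of the three terms concrete; the two hypothetical values of Θ and all probabilities are made up for the example.

```python
# Toy illustration of equation 1: Theta takes one of two hypothetical values
# and K is a single observed outcome. The probabilities are assumptions.
prior = {"sparse": 0.7, "dense": 0.3}        # p(Theta)
likelihood = {"sparse": 0.9, "dense": 0.2}   # p(K | Theta) for the observed K

evidence = sum(likelihood[t] * prior[t] for t in prior)             # p(K)
posterior = {t: likelihood[t] * prior[t] / evidence for t in prior}

print(posterior)  # the posterior sums to 1 thanks to the evidence term
```

Because the observation is far more probable under the "sparse" hypothesis, the posterior shifts further toward it than the prior alone would suggest.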
REFERENCES

[1] A. Stanton, et al., "Mitigating artifacts in projection onto convex sets interpolation," 2015.
[2] L. Carin, et al., "Nonparametric factor analysis with beta process priors," ICML, 2009.
[3] T. J. Ulrych, et al., "A Bayes tour of inversion: A tutorial," 2001.
[4] D. J. Verschuur, et al., "The utilization of the double focal transformation for sparse data representation and data reconstruction," 2016.
[5] R. Abma, et al., "3D interpolation of irregular data with a POCS algorithm," 2006.
[6] M. J. Porsani, et al., "Seismic trace interpolation using half-step prediction filters," 1999.
[7] A. C. Faul, et al., "Relevance vector machines with uncertainty measure for seismic Bayesian compressive sensing and survey design," 15th IEEE International Conference on Machine Learning and Applications (ICMLA), 2016.
[8] A. Duijndam, "Bayesian estimation in seismic inversion. Part II: Uncertainty analysis," 1988.
[9] G. Sapiro, et al., "Non-parametric Bayesian dictionary learning for sparse image representations," NIPS, 2009.
[10] D. Grana, et al., "Joint probabilistic petrophysics-seismic inversion based on Gaussian mixture and Markov chain prior models," 2018.
[11] M. D. Sacchi, et al., "Interpolation and extrapolation using a high-resolution discrete Fourier transform," IEEE Trans. Signal Process., 1998.
[12] A. C. Faul, et al., "Bayesian feature learning for seismic compressive sensing and denoising," 2017.
[13] M. P. Friedlander, et al., "Probing the Pareto frontier for basis pursuit solutions," SIAM J. Sci. Comput., 2008.
[14] D. B. Dunson, et al., "Nonparametric Bayesian dictionary learning for analysis of noisy and incomplete images," IEEE Transactions on Image Processing, 2012.
[15] C. E. Rasmussen, et al., "Healing the relevance vector machine through augmentation," ICML, 2005.
[16] M. E. Tipping, "Sparse Bayesian learning and the relevance vector machine," 2001.
[17] F. J. Herrmann, et al., "Non-parametric seismic data recovery with curvelet frames," 2008.
[18] N. Kreimer, et al., "A tensor higher-order singular value decomposition for prestack seismic data noise reduction and interpolation," 2012.
[19] R. Saab, et al., "Bayesian wavefield separation by transform-domain sparsity promotion," 2008.
[20] J. H. McClellan, et al., "Seismic data denoising through multiscale and sparsity-promoting dictionary learning," 2015.
[21] A. Malinverno, et al., "Expanded uncertainty quantification in inverse problems: Hierarchical Bayes and empirical Bayes," 2004.
[22] H. Mansour, et al., "Efficient matrix completion for seismic data reconstruction," 2015.
[23] J. Ma, et al., "Simultaneous dictionary learning and denoising for seismic data," 2014.
[24] M. D. Sacchi, et al., "Fourier reconstruction of nonuniformly sampled, aliased seismic data," 2007.
[25] S. Spitz, "Seismic trace interpolation in the F-X domain," 1991.
[26] M. E. Tipping, et al., "Fast marginal likelihood maximisation for sparse Bayesian models," 2003.
[27] W. Söllner, et al., "Dictionary learning for signal-to-noise ratio enhancement," 2015.
[28] M. D. Sacchi, et al., "Accurate interpolation with high-resolution time-variant Radon transforms," 2002.
[29] M. D. Sacchi, et al., "Multistep autoregressive reconstruction of seismic records," 2007.
[30] A. Duijndam, "Bayesian estimation in seismic inversion. Part I: Principles," 1988.
[31] N. Kreimer, et al., "A tensor higher-order singular value decomposition (HOSVD) for pre-stack simultaneous noise-reduction and interpolation," 2011.