Bayesian model based spatiotemporal survey designs and partially observed log Gaussian Cox process

Abstract In geostatistics, the spatiotemporal design for data collection is central to accurate prediction and parameter inference. An important class of geostatistical models is the log-Gaussian Cox process (LGCP), but there are no formal analyses of spatial or spatiotemporal survey designs for it. In this work, we study traditional balanced and uniform random designs in situations where the analyst has prior information on the intensity function of the LGCP, and we show that these designs are not efficient in such situations. We also propose a new design sampling method, a rejection sampling design, which extends the traditional balanced and random designs by directing survey sites to locations that are a priori expected to provide the most information. We compare our proposal with the traditional balanced and uniform random designs using the expected average predictive variance (APV) loss and the expected Kullback–Leibler (KL) divergence between the prior and the posterior for the LGCP intensity function, in simulation experiments and in a real-world case study. The APV measures the expected accuracy of a survey design in point-wise predictions, and the KL divergence measures the expected gain in information about the joint distribution of the intensity field. The case study concerns planning a survey design for analyzing the larval areas of two commercially important fish stocks in the Finnish coastal region. Our experiments show that the designs generated by the proposed rejection sampling method clearly outperform the traditional balanced and uniform random survey designs. Moreover, the method is easily applicable to other models.
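The idea of a rejection sampling design can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the acceptance rule, so the code below assumes that candidate sites are proposed uniformly over a rectangular study area and accepted with probability given by a user-supplied weight in [0, 1]. The function names `rejection_sampling_design` and `prior_weight`, the Gaussian-bump weight, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def rejection_sampling_design(n_sites, prior_weight, bounds, rng=None,
                              max_tries=100_000):
    """Draw n_sites survey locations inside a rectangle.

    Each candidate is proposed uniformly and kept with probability
    prior_weight(x), which is ASSUMED to lie in [0, 1] and to encode how
    informative the analyst expects location x to be a priori.
    """
    rng = np.random.default_rng(rng)
    lo, hi = np.asarray(bounds, dtype=float).T  # per-dimension lower/upper limits
    sites = []
    for _ in range(max_tries):
        x = rng.uniform(lo, hi)            # uniform random proposal
        if rng.uniform() < prior_weight(x):  # accept/reject step
            sites.append(x)
            if len(sites) == n_sites:
                return np.array(sites)
    raise RuntimeError("max_tries reached before n_sites were accepted")


def prior_weight(x, center=(0.7, 0.3), scale=0.15):
    """Hypothetical weight: a Gaussian bump around an assumed hotspot."""
    d2 = np.sum((np.asarray(x) - np.asarray(center)) ** 2)
    return float(np.exp(-0.5 * d2 / scale**2))


# Example: 20 sites on the unit square, concentrated near the assumed hotspot.
design = rejection_sampling_design(20, prior_weight, [(0, 1), (0, 1)], rng=0)
```

With a constant weight of 1 every proposal is accepted and the sketch reduces to a uniform random design; a balanced design could in principle be obtained by replacing the uniform proposals with a quasi-random sequence, which is again an assumption rather than something the abstract specifies.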
