Augmenting correlation structures in spatial data using deep generative models

State-of-the-art deep learning methods have shown a remarkable capacity to model complex data domains, but struggle with geospatial data. In this paper, we introduce SpaceGAN, a novel generative model for geospatial domains that learns neighbourhood structures through spatial conditioning. We propose to enhance spatial representation beyond mere spatial coordinates by conditioning each data point on the feature vectors of its spatial neighbours, thus allowing for a more flexible representation of the spatial structure. To overcome issues of training convergence, we employ a metric capturing the loss in local spatial autocorrelation between real and generated data as the stopping criterion for SpaceGAN parametrization. This way, we ensure that the generator produces synthetic samples faithful to the spatial patterns observed in the input. SpaceGAN is successfully applied to data augmentation and outperforms other methods of synthetic spatial data generation. Finally, we propose an ensemble learning framework for the geospatial domain that uses augmented SpaceGAN samples as training data for a set of ensemble learners. We empirically show the superiority of this approach over conventional ensemble learning approaches and rival spatial data augmentation methods on synthetic and real-world prediction tasks. Our findings suggest that SpaceGAN can be used as a tool for (1) artificially inflating sparse geospatial data and (2) improving the generalization of geospatial models.
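The two mechanisms named above, conditioning on neighbours and a stopping metric based on local spatial autocorrelation, can be illustrated with a minimal sketch. The abstract does not specify the neighbourhood definition, the autocorrelation statistic, or the distance between real and generated values, so everything here is an assumption: k-nearest-neighbour weights, local Moran's I as the statistic, and a mean absolute difference as the loss. The function names (`knn_indices`, `neighbour_context`, `local_moran`, `autocorrelation_loss`) are hypothetical and are not taken from the SpaceGAN code.

```python
import numpy as np


def knn_indices(coords, k):
    """Indices of the k nearest neighbours of each point (assumed neighbourhood)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbour
    return np.argsort(d, axis=1)[:, :k]


def neighbour_context(values, idx):
    """Conditioning vector per point: the values of its k neighbours, shape (n, k)."""
    return values[idx]


def row_standardised_weights(idx, n):
    """Spatial weight matrix with 1/k on each neighbour, so every row sums to 1."""
    w = np.zeros((n, n))
    w[np.arange(n)[:, None], idx] = 1.0 / idx.shape[1]
    return w


def local_moran(y, w):
    """Local Moran's I for each point: I_i = z_i * sum_j(w_ij * z_j) / m2."""
    z = y - y.mean()
    m2 = (z ** 2).mean()
    return z * (w @ z) / m2


def autocorrelation_loss(y_real, y_fake, w):
    """Assumed stopping metric: mean absolute gap in local Moran's I."""
    return np.abs(local_moran(y_real, w) - local_moran(y_fake, w)).mean()


# Toy data: a smooth gradient on a 10 x 10 grid versus a shuffled copy of it.
rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.arange(10.0), np.arange(10.0))
coords = np.column_stack([gx.ravel(), gy.ravel()])
y_real = coords[:, 0] + coords[:, 1]
y_fake = rng.permutation(y_real)  # same values, spatial pattern destroyed

idx = knn_indices(coords, k=4)
w = row_standardised_weights(idx, len(coords))
context = neighbour_context(y_real, idx)  # what a generator could be conditioned on
loss = autocorrelation_loss(y_real, y_fake, w)
```

Under these assumptions, a training loop would evaluate `autocorrelation_loss` on generator samples at regular checkpoints and keep the checkpoint with the smallest value, rather than relying on the adversarial losses to signal convergence.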
