Spatiotemporal Satellite Image Fusion Using Deep Convolutional Neural Networks

We propose a novel spatiotemporal fusion method based on deep convolutional neural networks (CNNs) for applications involving massive remote sensing data. In the training stage, we build two five-layer CNNs to handle the complicated correspondence and the large spatial resolution gap between MODIS and Landsat images. Specifically, we first learn a nonlinear mapping CNN between MODIS and low-spatial-resolution (LSR) Landsat images and then learn a super-resolution (SR) CNN between LSR Landsat and original Landsat images. In the prediction stage, instead of directly taking the outputs of the CNNs as the fusion result, we design a fusion model consisting of high-pass modulation and a weighting strategy to make full use of the information in prior images. Specifically, we first map the input MODIS images to transitional images via the learned nonlinear mapping CNN and further improve the transitional images to LSR Landsat images via the fusion model; then, via the learned SR CNN, the LSR Landsat images are super-resolved to transitional images, which are further improved to Landsat images via the fusion model. Compared with previous learning-based fusion methods, chiefly the sparse-representation-based methods, our CNN-based spatiotemporal method has the following advantages: 1) automatically extracting effective image features; 2) learning an end-to-end mapping between MODIS and LSR Landsat images; and 3) generating more favorable fusion results. To examine the performance of the proposed fusion method, we conduct experiments on two representative Landsat–MODIS datasets and compare it with the sparse-representation-based spatiotemporal fusion model. Quantitative evaluations on all possible prediction dates, together with visual and quantitative comparisons of the fusion results on one key date, demonstrate that the proposed method generates more accurate fusion results.
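
The prediction stage described above can be illustrated with a minimal NumPy sketch. The abstract does not give the fusion formulas, so everything here is an assumption made for illustration: the ratio-gain form of high-pass modulation, the inverse-change weighting over two prior dates, and the names `high_pass_modulation`, `weighted_fusion`, and `predict_stage` are hypothetical, and `cnn` stands in for either trained network.

```python
import numpy as np


def high_pass_modulation(trans_pred, trans_prior, target_prior, eps=1e-6):
    """Inject prior-date detail into the transitional image of the prediction date.

    ASSUMED form (not taken from the paper):
        pred = trans_pred + gain * (target_prior - trans_prior)
    with gain = trans_pred / trans_prior, a common ratio-based modulation.
    """
    gain = trans_pred / np.maximum(trans_prior, eps)
    return trans_pred + gain * (target_prior - trans_prior)


def weighted_fusion(pred_a, pred_b, change_a, change_b, eps=1e-6):
    """Blend predictions derived from two prior dates.

    ASSUMED weighting: each weight is inversely proportional to the
    magnitude of change observed between that prior date and the
    prediction date, so the more similar date contributes more.
    """
    inv_a = 1.0 / (np.abs(change_a) + eps)
    inv_b = 1.0 / (np.abs(change_b) + eps)
    w_a = inv_a / (inv_a + inv_b)
    return w_a * pred_a + (1.0 - w_a) * pred_b


def predict_stage(cnn, x_pred, x_prior, target_prior):
    """One stage of the pipeline: CNN output first, then fusion with the prior image.

    `cnn` is a hypothetical callable for a trained network (the nonlinear
    mapping CNN in the first stage, the SR CNN in the second).
    """
    trans_pred = cnn(x_pred)    # transitional image at the prediction date
    trans_prior = cnn(x_prior)  # transitional image at the prior date
    return high_pass_modulation(trans_pred, trans_prior, target_prior)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    identity_cnn = lambda x: x  # placeholder, not a trained model
    modis_prior = rng.uniform(0.05, 0.5, size=(4, 4))
    modis_pred = modis_prior + rng.normal(0.0, 0.01, size=(4, 4))
    lsr_prior = modis_prior + rng.normal(0.0, 0.02, size=(4, 4))
    lsr_pred = predict_stage(identity_cnn, modis_pred, modis_prior, lsr_prior)
    print(lsr_pred.shape)
```

Under these assumptions the same `predict_stage` call would be applied twice, once per CNN, with the output of the first stage feeding the second; when two prior dates are available, `weighted_fusion` would combine the two resulting predictions.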
