Greenhouse Segmentation on High-Resolution Optical Satellite Imagery using Deep Learning Techniques

Greenhouse segmentation has pivotal importance for climate-smart agricultural land-use planning. Deep learning-based approaches provide state-of-the-art performance in natural image segmentation. However, semantic segmentation on high-resolution optical satellite imagery is a challenging task because of the complex environment. In this paper, a sound methodology is proposed for pixel-wise classification on images acquired by the Azersky (SPOT-7) optical satellite. In particular, customized variations of U-Net-like architectures are employed to identify greenhouses. Two models are proposed which uniquely incorporate dilated convolutions and skip connections, and the results are compared to that of the baseline U-Net model. The dataset used consists of pan-sharpened orthorectified Azersky images (red, green, blue,and near infrared channels) with 1.5-meter resolution and annotation masks, collected from 15 regions in Azerbaijan where the greenhouses are densely congested. The images cover the cumulative area of 1008 $km^2$ and annotation masks contain 47559 polygons in total. The $F_1, Kappa, AUC$, and $IOU$ scores are used for performance evaluation. It is observed that the use of the deconvolutional layers alone throughout the expansive path does not yield satisfactory results; therefore, they are either replaced or coupled with bilinear interpolation. All models benefit from the hard example mining (HEM) strategy. It is also reported that the best accuracy of $93.29\%$ ($F_1\,score$) is recorded when the weighted binary cross-entropy loss is coupled with the dice loss. Experimental results showed that both of the proposed models outperformed the baseline U-Net architecture such that the best model proposed scored $4.48\%$ higher in comparison to the baseline architecture.

[1]  Vladlen Koltun,et al.  Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.

[2]  Wei Li,et al.  DeepUNet: A Deep Fully Convolutional Network for Pixel-Level Sea-Land Segmentation , 2017, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[3]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Edward H. Adelson,et al.  The Laplacian Pyramid as a Compact Image Code , 1983, IEEE Trans. Commun..

[5]  Abhinav Gupta,et al.  Training Region-Based Object Detectors with Online Hard Example Mining , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[7]  Alexandr A. Kalinin,et al.  Albumentations: fast and flexible image augmentations , 2018, Inf..

[8]  John D. Austin,et al.  Adaptive histogram equalization and its variations , 1987 .

[9]  Dilek Koc-San,et al.  Plastic and Glass Greenhouses Detection and Delineation from WORLDVIEW-2 Satellite Imagery , 2016 .

[10]  Thomas A. Funkhouser,et al.  Dilated Residual Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Filiz Bektas Balcik,et al.  Greenhouse Mapping using Object Based Classification and Sentinel-2 Satellite Imagery , 2019, 2019 8th International Conference on Agro-Geoinformatics (Agro-Geoinformatics).

[12]  Vincent Dumoulin,et al.  Deconvolution and Checkerboard Artifacts , 2016 .

[13]  Roberto Cipolla,et al.  SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Keisuke Nemoto,et al.  Effective Use of Dilated Convolutions for Segmenting Small Object Instances in Remote Sensing Imagery , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[15]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[16]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Xiaoqi Li,et al.  Towards Accurate High Resolution Satellite Image Semantic Segmentation , 2019, IEEE Access.

[18]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[19]  Garrison W. Cottrell,et al.  Understanding Convolution for Semantic Segmentation , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[20]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[21]  Jean-Michel Morel,et al.  Non-Local Means Denoising , 2011, Image Process. Line.

[22]  Jiaming Liu,et al.  Accuracy Improvement of UNet Based on Dilated Convolution , 2019 .