Automatic water detection from multidimensional hierarchical clustering for Sentinel-2 images and a comparison with Level 2A processors

Abstract: Continuous monitoring of water surfaces is essential for water resource management. This study presents a nonparametric, unsupervised, automatic algorithm for identifying inland water pixels in multispectral satellite data using multidimensional clustering and a high-performance subsampling approach for large scenes. Cluster analysis identifies similar samples in a multidimensional data space. Spectral information and derived indices were used to characterize each scene pixel individually. A machine learning approach with random subsampling and generalization through a Naive Bayes classifier is also proposed to make the application of complex algorithms to large scenes feasible. Accuracy was evaluated using an independent dataset of water bodies in 15 Sentinel-2 images over France, acquired in different seasons and covering a large range of water bodies and water colour types. The validation dataset covers a water surface of more than 1200 km² (approximately 12 million pixels) and includes over 80,000 water bodies outlined using a semiautomatic active learning method and then manually revised. The classification results were compared with the water pixel classifications from three of the major Level 2A processors (MAJA, Sen2Cor and FMask) and from two of the most common thresholding techniques, Otsu and Canny-edge. An input mask was used to remove coastal waters, clouds, shadows and snow pixels. Water pixels were identified automatically from the clustering process without the need for ancillary or pretrained data. Combinations of up to three water indices (the Modified Normalized Difference Water Index, MNDWI; the Normalized Difference Water Index, NDWI; and the Multiband Water Index, MBWI) and two reflectance bands (B8 and B12) were tested in the algorithm, and the best combination was NDWI-B12.
Among all the methods tested, the proposed method achieved the highest mean kappa score, 0.874, across all tested scenes, with a per-scene kappa ranging from 0.608 to 0.980, and the lowest mean standard deviation of 0.091. Standard Otsu thresholding performed worst because the index histograms were often not bimodal, and the Canny-edge variation achieved an overall kappa of 0.718 when used with the MNDWI. Among the water masks provided by generic processors, FMask outperformed MAJA and Sen2Cor, with an overall kappa of 0.764. In-depth analysis shows a rapid drop in performance for all methods when identifying water bodies with a surface area below 0.5 ha, but the proposed approach outperformed the second-best method by 34% in this size class.
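
The subsample, cluster and generalize steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the scene is synthetic (the reflectance values are invented), the number of clusters is fixed at two, Ward linkage is used, the water cluster is assumed to be the one with the highest mean NDWI, and `detect_water` and its parameters are hypothetical names. The sketch relies on scikit-learn's `AgglomerativeClustering`, `GaussianNB` and `cohen_kappa_score`.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import cohen_kappa_score
from sklearn.naive_bayes import GaussianNB


def detect_water(features, ndwi_column=0, sample_size=2000, n_clusters=2, seed=0):
    """Cluster a random pixel subsample, then generalize with Naive Bayes.

    Hypothetical helper: returns 1 for pixels classified as water, else 0.
    """
    rng = np.random.default_rng(seed)
    n_pixels = features.shape[0]

    # Random subsampling keeps the hierarchical clustering tractable.
    idx = rng.choice(n_pixels, size=min(sample_size, n_pixels), replace=False)
    sample = features[idx]

    # Agglomerative (hierarchical) clustering on the subsample only.
    clustering = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")
    labels = clustering.fit_predict(sample)

    # Assumption: the water cluster has the highest mean NDWI.
    means = [sample[labels == c, ndwi_column].mean() for c in range(n_clusters)]
    water_cluster = int(np.argmax(means))
    sample_is_water = (labels == water_cluster).astype(int)

    # Generalize the subsample labels to every pixel in the scene.
    classifier = GaussianNB().fit(sample, sample_is_water)
    return classifier.predict(features)


# Synthetic "scene": invented reflectances, not real Sentinel-2 data.
rng = np.random.default_rng(42)
n_pixels = 20_000
is_water = rng.random(n_pixels) < 0.2
green = np.where(is_water, 0.06, 0.08) + rng.normal(0, 0.005, n_pixels)
nir = np.where(is_water, 0.03, 0.30) + rng.normal(0, 0.005, n_pixels)
b12 = np.where(is_water, 0.03, 0.20) + rng.normal(0, 0.005, n_pixels)

# NDWI (McFeeters): (green - NIR) / (green + NIR).
ndwi = (green - nir) / (green + nir)
features = np.column_stack([ndwi, b12])  # the NDWI-B12 combination

predicted = detect_water(features)
truth = is_water.astype(int)
kappa = cohen_kappa_score(truth, predicted)
print(f"kappa on the synthetic scene: {kappa:.3f}")
```

Because the clustering runs only on the subsample, its cost depends on `sample_size` rather than on the scene size. The Naive Bayes step, which is comparatively cheap, is what scales the result to the full image.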
