A Scalable Machine Learning Pipeline for Paddy Rice Classification Using Multi-Temporal Sentinel Data

The demand for rice production in Asia is expected to increase by 70% in the next 30 years, which makes evident the need for balanced productivity and effective food security management at the national and continental levels. Consequently, timely and accurate mapping of paddy rice extent, together with an assessment of its productivity, is of utmost significance. This in turn requires continuous area monitoring and large-scale mapping at the parcel level, through the processing of big satellite data of high spatial resolution. This work designs and implements a paddy rice mapping pipeline for South Korea based on time series of Sentinel-1 and Sentinel-2 data for 2018. We address two challenges. The first is the ability of our model to manage big satellite data and scale to a nationwide application. The second is the algorithm's capacity to cope with the scarcity of labeled data for training supervised machine learning algorithms. Specifically, we implement an approach that combines unsupervised and supervised learning. First, we generate pseudo-labels for rice classification from a single site (Seosan-Dangjin) using a dynamic k-means clustering approach. The pseudo-labels are then used to train a Random Forest (RF) classifier that is fine-tuned to generalize to two other sites (Haenam and Cheorwon). The optimized model was then tested against 40 labeled plots, evenly distributed across the country. The paddy rice mapping pipeline is scalable, as it has been deployed in a High Performance Data Analytics (HPDA) environment using distributed implementations of both the k-means and RF classifiers. When tested across the country, our model achieved an overall accuracy of 96.69% and a kappa coefficient of 0.87. Moreover, the accurate paddy rice area map was available early in the year (late July), which is key for timely decision-making.
Finally, the performance of the generalized paddy rice classification model, when applied to the sites of Haenam and Cheorwon, was compared with the performance of two equivalent models trained with locally sampled labels. The results were comparable, highlighting the success of the model's generalization and its applicability to other regions.
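
The two evaluation metrics reported above, overall accuracy and the kappa coefficient, can both be derived from a confusion matrix. The following is a minimal sketch of that calculation in plain Python. The function name `accuracy_and_kappa` and the 2x2 matrix are hypothetical and chosen only for illustration; the counts are not the study's data and do not reproduce the reported 96.69% or 0.87.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n_classes)) / total

    # Agreement expected by chance, from the row and column marginals.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix)
        for i in range(n_classes)
    ) / (total * total)

    # Kappa is undefined when expected == 1 (all samples in one class);
    # this sketch does not handle that degenerate case.
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts: rows = reference (rice, non-rice),
# columns = predicted (rice, non-rice).
example_matrix = [
    [40, 10],
    [5, 45],
]

overall_accuracy, kappa = accuracy_and_kappa(example_matrix)
print(f"overall accuracy = {overall_accuracy:.2f}, kappa = {kappa:.2f}")
```

Because kappa discounts the agreement expected by chance, it is typically lower than the overall accuracy when one class dominates the map, which is one reason to report the two figures together.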
