Prediction of Winter Wheat Yield Based on Multi-Source Data and Machine Learning in China

Wheat is one of the main crops in China, and crop yield prediction is important for regional trade and national food security. There is growing interest in how to integrate multi-source data and machine learning techniques to build a simple, timely, and accurate crop yield prediction model at the administrative-unit level. Most previous studies focused on the whole crop growth period, relying on expensive manual surveys, remote sensing, or climate data. However, the effect of selecting different time windows on yield prediction remains unknown. We therefore divided the whole growth period into four time windows and assessed the predictive ability of each, taking the major winter wheat production regions of China as an example. We first developed a modeling framework on the Google Earth Engine (GEE) platform that integrates climate, remote sensing, and soil data to predict winter wheat yield. The results show that the models can accurately predict yield 1 to 2 months before the harvesting dates at the county level in China, with R² > 0.75 and a yield error of less than 10%. Support vector machine (SVM), Gaussian process regression (GPR), and random forest (RF) were the three best-performing methods among the eight typical machine learning models tested in this study. We also found that agricultural zone and temporal training settings affect prediction accuracy. The three models perform better as more information from the winter wheat growing season becomes available. Our findings highlight a potentially powerful tool for predicting yield from multi-source data and machine learning in other regions and for other crops.
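The two accuracy figures quoted in the abstract (R² and yield error) can be illustrated with a minimal sketch. The abstract does not define "yield error", so the code below assumes it means the mean absolute percentage error; the function names and the county yield values are hypothetical and are not taken from the study.

```python
from math import fsum


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = fsum(observed) / len(observed)
    ss_res = fsum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = fsum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_abs_pct_error(observed, predicted):
    """Mean absolute percentage error, in percent (assumed meaning of 'yield error')."""
    ratios = (abs(o - p) / o for o, p in zip(observed, predicted))
    return 100.0 * fsum(ratios) / len(observed)


def meets_thresholds(observed, predicted, min_r2=0.75, max_error_pct=10.0):
    """Check predictions against the thresholds quoted in the abstract."""
    return (r_squared(observed, predicted) > min_r2
            and mean_abs_pct_error(observed, predicted) < max_error_pct)


# Hypothetical county-level yields in t/ha, for illustration only.
observed = [4.0, 5.0, 6.0, 7.0]
predicted = [4.2, 4.9, 6.3, 6.8]

print(round(r_squared(observed, predicted), 3))
print(round(mean_abs_pct_error(observed, predicted), 2))
print(meets_thresholds(observed, predicted))
```

In a time-window comparison, the same check would be run once per window, so that the earliest window meeting both thresholds indicates how far ahead of harvest a usable forecast is available.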
