Estimating wheat yields in Australia using climate records, satellite image time series and machine learning methods

Abstract. Closing the gap between actual and potential wheat yields in Australia is important for meeting the growing global demand for food. Identifying yield gap hotspots, where the potential for improvement is greatest, is a necessary step towards this goal. While crop growth models are well suited to quantifying potential yields, they cannot provide accurate large-scale estimates of actual yields, owing to the sheer quantity of data they require for parameterisation. In this context, we sought to provide accurate estimates of actual wheat yields across the Australian wheat belt based on machine-learning regression methods, climate records and satellite image time series. Out of nine base learners and two ensembles, support vector regression with a radial basis function kernel emerged as the single best learner (root mean square error of 0.55 t ha⁻¹ and R² of 0.77 at the pixel level). At the national scale, this model explained 73% of the yield variability observed across statistical units. Benchmark approaches based on peak Normalised Difference Vegetation Index (NDVI) and on a harvest index were largely outperformed by the machine-learning regression models (R²
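As an illustration of the accuracy measures quoted in the abstract, the sketch below computes the root mean square error and R² and applies them to a simple peak-NDVI benchmark. It is a minimal example under stated assumptions, not the authors' implementation: the benchmark is assumed here to be an ordinary least squares fit of yield against the seasonal maximum NDVI, and the function names and all numbers are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (e.g. t/ha)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_peak_ndvi_benchmark(ndvi_series, yields):
    """Fit yield = intercept + slope * peak NDVI by ordinary least squares.

    Assumption: the benchmark is a univariate linear model on the seasonal
    maximum NDVI; the source does not spell out its exact form.
    """
    peaks = [max(series) for series in ndvi_series]
    n = len(peaks)
    mean_x = sum(peaks) / n
    mean_y = sum(yields) / n
    sxx = sum((x - mean_x) ** 2 for x in peaks)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(peaks, yields))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_peak_ndvi(model, ndvi_series):
    """Predict yield from the peak of each NDVI time series."""
    slope, intercept = model
    return [intercept + slope * max(series) for series in ndvi_series]


# Hypothetical NDVI time series (one list per pixel) and yields in t/ha.
ndvi = [
    [0.20, 0.45, 0.62, 0.40],
    [0.18, 0.50, 0.78, 0.55],
    [0.22, 0.35, 0.48, 0.30],
    [0.25, 0.60, 0.85, 0.50],
]
observed_yield = [1.9, 2.8, 1.2, 3.1]

model = fit_peak_ndvi_benchmark(ndvi, observed_yield)
predicted_yield = predict_peak_ndvi(model, ndvi)
print("RMSE (t/ha):", rmse(observed_yield, predicted_yield))
print("R2:", r_squared(observed_yield, predicted_yield))
```

The same two functions can score any regression model, so a machine-learning model and a benchmark can be compared on identical held-out data.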
