Integrating Remote Sensing and Weather Variables for Mango Yield Prediction Using a Machine Learning Approach

Accurate pre-harvest yield forecasting of mango is essential to the industry because it supports better decision making around harvesting logistics and forward selling, thus optimizing productivity and reducing food waste. Current methods for yield forecasting, such as manually counting fruit on 2–3% of the orchard, can be accurate but are time-consuming and labour-intensive. More recent evaluations of technological solutions such as remote (satellite) and proximal (on-ground) sensing have provided very encouraging results, but they still require in-field, in-season sampling for calibration, the technology comes at a significant cost, and commercial availability is limited, especially for vehicle-mounted sensors. This study presents the first evaluation of a "time series"-based remote sensing (RS) method for yield forecasting of mango, a method that does not require in-field fruit counts and utilizes freely available satellite imagery. Historic yield data from 2015 to 2022 were sourced from 51 individual orchard blocks on two farms (AH and MK) in the Northern Territory of Australia. Time series measures of the canopy reflectance properties of the blocks were obtained from Landsat 7 and 8 satellite data for the 2015–2022 growing seasons. From the imagery, the following vegetation indices (VIs) were derived: EVI, GNDVI, NDVI, and LSWI. Corresponding weather variables (rainfall (Prec), temperature (Tmin/Tmax), evapotranspiration (ETo), solar radiation (Rad), and vapor pressure deficit (VPD)) were sourced from SILO data. To determine the relationships between the weather and remotely sensed measures of the canopy throughout the growing season and the yield achieved (at the block level and the farm level), six machine learning (ML) algorithms were trialed: random forest (RF), support vector regression (SVR), eXtreme gradient boosting (XGBoost), ridge regression, LASSO, and partial least squares regression (PLSR).
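As an illustration of the four vegetation indices named above, the following is a minimal sketch using their commonly published formulations. It assumes surface reflectance values scaled to 0–1 and assumes LSWI is computed from the shortwave infrared band near 1.6 µm; the function names and the sample reflectances are hypothetical and are not taken from the study.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else float("nan")


def vegetation_indices(blue, green, red, nir, swir1):
    """Compute NDVI, GNDVI, LSWI and EVI from band reflectances (0-1 scale).

    Assumption: standard textbook formulations; the study's exact band
    choices and coefficients are not given in the abstract.
    """
    return {
        "NDVI": normalized_difference(nir, red),
        "GNDVI": normalized_difference(nir, green),
        "LSWI": normalized_difference(nir, swir1),
        # EVI with the widely used coefficients G=2.5, C1=6, C2=7.5, L=1
        "EVI": 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0),
    }


# Hypothetical reflectances for a single dense-canopy pixel
example = vegetation_indices(blue=0.03, green=0.06, red=0.04, nir=0.40, swir1=0.20)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

In a time series workflow, such per-pixel values would typically be averaged per block for each image date before being passed to a model as predictors.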
EVI/GNDVI and Prec/Tmin were found to be the best remote sensing (RS) and weather predictors, respectively. The block-level combined RS/weather-based RF model for 2021 produced the best result (MAE = 2.9 t/ha), marginally better than the RS-only RF model (MAE = 3.4 t/ha). The farm-level model error (FLEM) was generally lower than the block-level model error, for both the combined RS/weather-based RF model (farm = 3.7%, block (NMAE) = 33.6% for 2021) and the RS-based model (farm = 4.3%, block = 38.4% for 2021). Further testing of the RS/weather-based RF models over six additional orchards (other than AH and MK) produced errors ranging between 24% and 39% from 2016 to 2020. Although prediction accuracy varied at both the block level and the farm level, this preliminary study demonstrates the potential of a "time series" RS method for predicting mango yields. The benefits to the mango industry are that it utilizes freely available imagery, requires no in-field calibration, and provides predictions several months before the commercial harvest. This outcome therefore not only presents a more adoptable option for the industry, but also better supports automation and scalability in terms of block-, farm-, regional-, and national-level forecasting.
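The gap between block-level and farm-level error reported above can be illustrated with a small sketch. It assumes NMAE is MAE divided by the mean observed yield, and that farm-level error is the absolute percentage difference between area-weighted total predicted and observed production; the abstract does not state either definition, and the yields and areas below are hypothetical.

```python
def mae(observed, predicted):
    """Mean absolute error, in the units of the inputs (e.g. t/ha)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def nmae_percent(observed, predicted):
    """MAE normalized by the mean observed value, as a percentage (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * mae(observed, predicted) / mean_observed


def farm_level_error_percent(observed, predicted, areas_ha):
    """Absolute percentage error of total farm production (assumed definition).

    Block yields (t/ha) are weighted by block area (ha) to get tonnes.
    """
    observed_total = sum(o * a for o, a in zip(observed, areas_ha))
    predicted_total = sum(p * a for p, a in zip(predicted, areas_ha))
    return 100.0 * abs(predicted_total - observed_total) / observed_total


# Hypothetical block yields (t/ha) and block areas (ha) for one farm
observed = [8.0, 10.0, 12.0]
predicted = [10.0, 9.0, 11.0]
areas = [2.0, 2.0, 2.0]

block_mae = mae(observed, predicted)
block_nmae = nmae_percent(observed, predicted)
farm_error = farm_level_error_percent(observed, predicted, areas)
print(f"block MAE = {block_mae:.2f} t/ha, block NMAE = {block_nmae:.1f}%")
print(f"farm-level error = {farm_error:.1f}%")
```

In this toy case the over- and under-predictions of individual blocks cancel when summed, which is one plausible reason a farm-level error can be much smaller than the block-level NMAE.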
