Data science at farm level: Explaining and predicting within-farm variability in potato growth and yield

Abstract The growth and yield of crops vary widely among the fields of a farm. Farms are increasing in size by acquiring smaller land parcels from different farmers who have different management strategies. As a result, between-field variability increases, and understanding such variability is a necessity for precision farming. New data analysis techniques are needed in this context, especially as more farms collect more data. Therefore, the first objective of this study is to provide a data analysis methodology to analyze within-year variability and identify year-independent factors that influence growth. As a second objective, we applied this novel methodology to a case study, in which we analyzed potato growth data from four successive years on a farm in the south of the Netherlands. The methodology consists of three main steps: (1) describing growth using mixed models, (2) clustering and explaining growth by correlating the clusters to (a) yield, (b) other plant characteristics and (c) defining, limiting and reducing variables, and (3) predicting growth by automatically selecting a regression model. Applying our method to the potato growth data gave the following main results: (1) the estimated growth curves of the stems, haulm and tubers explain the between-field variability in growth very well (R² of 0.85, 0.74 and 0.89, respectively), (2) clusters with a stem length between 110 and 130 cm have the highest average yield, (3) a deeper groundwater level and sugar beet or grass as the previously cultivated crop positively influence growth, and (4) N and K fertilization must be adjusted for optimal growth. In conclusion, this study responds to the quest for new data-based methods for sustainable intensification, and is the first to explicitly analyze and explain differences in crop growth between fields in practice. In addition, clear management advice could be provided, showing the scientific and practical potential of our methodology.
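Step (3), automatically selecting a regression model, can be illustrated with a minimal sketch. The abstract does not state which selection criterion or search strategy was used, so the sketch below assumes an exhaustive search over predictor subsets scored by the small-sample corrected Akaike information criterion (AICc). The predictor names and all numbers are hypothetical illustration data, not values from the study.

```python
from itertools import combinations
import math


def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit of y on X plus an intercept."""
    n = len(y)
    A = [[1.0] + list(row) for row in X]  # design matrix with intercept column
    p = len(A[0])
    # Augmented normal equations: (A^T A | A^T y)
    M = [
        [sum(A[i][r] * A[i][c] for i in range(n)) for c in range(p)]
        + [sum(A[i][r] * y[i] for i in range(n))]
        for r in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    beta = [M[r][p] / M[r][r] for r in range(p)]
    return sum(
        (y[i] - sum(b * a for b, a in zip(beta, A[i]))) ** 2 for i in range(n)
    )


def aicc(rss, n, k):
    """AICc for a Gaussian linear model with k estimated parameters."""
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)


def select_model(data, names, y):
    """Return (score, predictor names) of the subset with the lowest AICc."""
    n = len(y)
    best = None
    for size in range(1, len(names) + 1):
        for subset in combinations(range(len(names)), size):
            X = [[row[j] for j in subset] for row in data]
            # Parameters: slopes + intercept + error variance
            score = aicc(ols_rss(X, y), n, size + 2)
            if best is None or score < best[0]:
                best = (score, tuple(names[j] for j in subset))
    return best


# Hypothetical field-level data: ten fields, three candidate predictors.
names = ["groundwater_depth", "n_fertilizer", "k_fertilizer"]
gw = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
nf = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
kf = [2, 7, 1, 8, 2, 8, 1, 8, 2, 8]
noise = [0.1, -0.2, 0.15, -0.05, 0.0, 0.2, -0.1, -0.15, 0.05, 0.0]
data = [list(row) for row in zip(gw, nf, kf)]
# Synthetic response driven by groundwater depth only (an assumption).
growth = [10 + 3 * g + e for g, e in zip(gw, noise)]

best = select_model(data, names, growth)
print("selected predictors:", best[1])
```

An exhaustive search is only practical for a handful of candidate variables; with many predictors a stepwise or penalized approach would be the more usual choice.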
