Machine learning for accelerating process‐based computation of land biogeochemical cycles

Global change ecology now draws on ever‐growing observational datasets (big‐data) and complex mathematical models that track hundreds of ecological processes (big‐model). The rapid advance of this big‐data‐big‐model approach has reached a bottleneck: high computational requirements prevent further development of models that must be integrated over long time‐scales to simulate the distribution of ecosystem carbon and nutrient pools and fluxes. Here, we introduce a machine‐learning acceleration (MLA) tool to tackle this grand challenge. We focus on the most resource‐consuming step in terrestrial biosphere models (TBMs): the equilibration of biogeochemical cycles (spin‐up), a prerequisite that can take up to 98% of the computational time. We test the tool on three members of the ORCHIDEE TBM family, which is part of the IPSL Earth System Model, including versions that describe the complex interactions between nitrogen, phosphorus and carbon and for which the spin‐up has no analytical solution. An unoptimized MLA reduced the computational demand of global studies by 77%–80% by running the spin‐up for only a subset of model pixels and interpolating the equilibrated state of biogeochemical variables to the remaining pixels. Despite small biases in the MLA‐derived equilibrium, the resulting impact on the predicted regional carbon balance over recent decades is minor. We expect that computational demand can be lowered by an order of magnitude by optimizing the choice of machine learning algorithms and their settings, and by balancing the trade‐off between the quality of MLA predictions and the need for TBM simulations to generate training data and reduce bias. Our tool is applicable to any gridded model (not only TBMs), is compatible with existing spin‐up acceleration procedures, and opens the door to a wide variety of future applications, with complex non‐linear models benefiting most from the gain in computational efficiency.
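The core idea, running the expensive spin‐up on a subset of pixels and predicting the equilibrium state elsewhere, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the learning algorithm, so a simple inverse‐distance‐weighted k‐nearest‐neighbour regressor is swapped in here. The function `toy_equilibrium`, the two pixel features (temperature and precipitation), the synthetic pixels and the 20% training fraction are all hypothetical choices made for illustration.

```python
import math
import random


def toy_equilibrium(temp, precip):
    """Hypothetical stand-in for an expensive TBM spin-up at one pixel.

    Returns a made-up equilibrium carbon pool; a real TBM would need a
    long simulation to produce this value.
    """
    return 10.0 + 0.5 * precip - 0.2 * temp


def knn_predict(train_x, train_y, query, k=3):
    """Inverse-distance-weighted mean of the k nearest training pixels.

    Swapped-in regressor for illustration only; the abstract does not say
    which algorithm the MLA tool uses.
    """
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = dists[:k]
    if nearest[0][0] == 0.0:
        # Query coincides with a training pixel: return its value directly.
        return nearest[0][1]
    weights = [1.0 / d for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


random.seed(0)

# Synthetic "pixels", each described by (temperature, precipitation).
pixels = [(random.uniform(-10, 30), random.uniform(0, 20)) for _ in range(500)]

# Run the (toy) spin-up only on a 20% subset of pixels (arbitrary fraction).
train_idx = set(random.sample(range(len(pixels)), 100))
train_x = [pixels[i] for i in sorted(train_idx)]
train_y = [toy_equilibrium(*p) for p in train_x]

# Predict the equilibrium state for every pixel that was not simulated.
predicted = {
    i: knn_predict(train_x, train_y, pixels[i])
    for i in range(len(pixels))
    if i not in train_idx
}

# Compare with the "true" equilibrium to estimate the bias this introduces.
errors = [abs(predicted[i] - toy_equilibrium(*pixels[i])) for i in predicted]
mean_abs_error = sum(errors) / len(errors)
saved_fraction = len(predicted) / len(pixels)

print(f"pixels without spin-up: {saved_fraction:.0%}")
print(f"mean absolute error of predicted equilibrium: {mean_abs_error:.3f}")
```

In this toy setting the fraction of pixels that skip the spin‐up is fixed by the size of the training subset, which mirrors the trade‐off named in the abstract: a smaller subset saves more computation but leaves fewer training data and is likely to increase the bias of the predicted equilibrium.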
