Predicting carbon dioxide and energy fluxes across global FLUXNET sites with regression algorithms

Abstract. Spatio-temporal fields of land–atmosphere fluxes derived from data-driven models can complement simulations by process-based land surface models. While a number of strategies for empirical models with eddy-covariance flux data have been applied, a systematic intercomparison of these methods has so far been missing. In this study, we performed a cross-validation experiment for predicting carbon dioxide, latent heat, sensible heat and net radiation fluxes across different ecosystem types with 11 machine learning (ML) methods from four classes (kernel methods, neural networks, tree methods, and regression splines). We applied two complementary setups: (1) 8-day average fluxes based on remotely sensed data and (2) daily mean fluxes based on meteorological data and a mean seasonal cycle of remotely sensed variables. The patterns of predictions from the different ML methods and experimental setups were highly consistent. There were systematic differences in performance among the fluxes, with the following ascending order: net ecosystem exchange (R² 0.6), gross primary production (R² > 0.7), latent heat (R² > 0.7), sensible heat (R² > 0.7), and net radiation (R² > 0.8). The ML methods predicted the across-site variability and the mean seasonal cycle of the observed fluxes very well (R² > 0.7), while the 8-day deviations from the mean seasonal cycle were not well predicted (R²
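The evaluation described in the abstract can be illustrated with a minimal sketch. It assumes a cross-validation in which whole sites are held out per fold, uses ridge regression as a stand-in for the 11 ML methods, and scores predictions separately for across-site variability, the mean seasonal cycle, and the 8-day anomalies. The data are synthetic and constructed so that anomalies are mostly noise. All function names, fold counts, and coefficients are hypothetical and are not taken from the study.

```python
import numpy as np

def r2(obs, pred):
    """Coefficient of determination as 1 - SSE/SST (an assumed metric definition)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def site_folds(sites, k, seed=0):
    """Yield (train_mask, test_mask) pairs; each fold holds out whole sites."""
    rng = np.random.default_rng(seed)
    unique = rng.permutation(np.unique(sites))
    for held_out in np.array_split(unique, k):
        test = np.isin(sites, held_out)
        yield ~test, test

def fit_predict_ridge(X_tr, y_tr, X_te, lam=1.0):
    """Stand-in regressor: ridge regression with an unpenalized intercept."""
    mu, sd = X_tr.mean(0), X_tr.std(0) + 1e-12
    A = np.c_[np.ones(len(X_tr)), (X_tr - mu) / sd]
    B = np.c_[np.ones(len(X_te)), (X_te - mu) / sd]
    reg = lam * np.eye(A.shape[1])
    reg[0, 0] = 0.0
    w = np.linalg.solve(A.T @ A + reg, A.T @ y_tr)
    return B @ w

def decompose(values, sites, periods):
    """Split a series into site mean, seasonal cycle (minus site mean), and anomaly."""
    values = np.asarray(values, float)
    site_mean = np.zeros_like(values)
    msc = np.zeros_like(values)
    for s in np.unique(sites):
        m = sites == s
        site_mean[m] = values[m].mean()
        for p in np.unique(periods[m]):
            mp = m & (periods == p)
            msc[mp] = values[mp].mean()  # mean over years for this site and period
    return site_mean, msc - site_mean, values - msc

# Synthetic data: 20 sites, 3 years, 46 eight-day periods per year (all assumed).
rng = np.random.default_rng(42)
n_sites, n_years, n_per = 20, 3, 46
sites = np.repeat(np.arange(n_sites), n_years * n_per)
periods = np.tile(np.arange(n_per), n_sites * n_years)
site_attr = rng.normal(size=n_sites)[sites]      # static site property
season = np.sin(2 * np.pi * periods / n_per)     # seasonal driver
weather = rng.normal(size=sites.size)            # short-term driver
X = np.c_[site_attr, season, weather]
y = 2.0 * site_attr + 3.0 * season + 0.5 * weather + rng.normal(size=sites.size)

# Out-of-sample predictions: every site is predicted by a model that never saw it.
pred = np.empty_like(y)
for train, test in site_folds(sites, k=5):
    pred[test] = fit_predict_ridge(X[train], y[train], X[test])

scores = {"overall": r2(y, pred)}
names = ("across_site", "seasonal_cycle", "anomaly")
for name, o, p in zip(names, decompose(y, sites, periods), decompose(pred, sites, periods)):
    scores[name] = r2(o, p)
print(scores)
```

Holding out whole sites, rather than random samples, keeps observations from one tower out of both training and test sets, so the scores reflect prediction at unseen locations. Any regressor with a fit-and-predict interface could replace the ridge stand-in.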
