OptIC project: An intercomparison of optimization techniques for parameter estimation in terrestrial biogeochemical models

We describe results of a project known as OptIC (Optimisation InterComparison) for comparison of parameter estimation methods in terrestrial biogeochemical models. A highly simplified test model was used to generate pseudo-data, to which noise with different characteristics was added. Participants in the OptIC project were asked to estimate the model parameters used to generate these data, and to predict model variables into the future. Ten participants contributed results using one of the following methods: Levenberg-Marquardt, adjoint, Kalman filter, Markov chain Monte Carlo and genetic algorithm. Methods differed in how they locate the minimum (gradient descent or global search), how observations are processed (all at once or sequentially), the number of iterations used, and their assumptions about the statistics (some methods assume Gaussian probability density functions; others do not). We found the different methods to be equally successful at estimating the parameters in our application. The biggest variation in parameter estimates arose from the choice of cost function, not the choice of optimization method. Relatively poor results were obtained when the model-data mismatch in the cost function included weights that were instantaneously dependent on noisy observations. This was the case even when the magnitude of residuals varied with the magnitude of observations. Missing data caused estimates to be more scattered, and the uncertainty of predictions increased correspondingly. All methods gave biased results when the noise was temporally correlated or non-Gaussian, or when incorrect model forcing was used. Our results highlight the need for care in choosing the error model in any optimization.
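The effect of observation-dependent weights can be illustrated with a minimal sketch. This is not the OptIC test model or any participant's cost function; it assumes a hypothetical one-parameter model (a constant `c`) and a weighted least-squares cost, with the helper names `weighted_cost` and `best_constant` invented for this illustration. When each residual is scaled by the noisy observation itself, low observations receive larger weights than high ones, and the estimate is pulled below the value obtained with fixed weights.

```python
import numpy as np

def weighted_cost(c, obs, sigma):
    """Weighted least-squares cost for a constant model y = c (illustrative only)."""
    return float(np.sum(((obs - c) / sigma) ** 2))

def best_constant(obs, sigma):
    """Closed-form minimiser of weighted_cost: the weighted mean with weights 1/sigma^2."""
    w = 1.0 / sigma**2
    return float(np.sum(w * obs) / np.sum(w))

# Two hypothetical noisy observations scattered symmetrically about a true value of 1.0.
obs = np.array([0.5, 1.5])

# Fixed weights (sigma independent of the data): the estimate is the plain mean.
fixed = best_constant(obs, np.ones_like(obs))

# Weights tied instantaneously to the noisy observations (sigma = obs):
# the low observation dominates and the estimate is biased low.
obs_weighted = best_constant(obs, obs)

print(f"fixed weights:        c = {fixed:.3f}")
print(f"observation weights:  c = {obs_weighted:.3f}")
```

In this toy case the fixed-weight estimate recovers 1.0, while the observation-weighted estimate is 0.6, even though the two observations are scattered symmetrically about the true value.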
