Geostatistical inverse modeling with very large datasets: an example from the Orbiting Carbon Observatory 2 (OCO-2) satellite

Abstract. Geostatistical inverse modeling (GIM) has become a common approach to estimating greenhouse gas fluxes at the Earth's surface using atmospheric observations. GIMs are unique relative to other commonly used approaches because they do not require a single emissions inventory or a bottom–up model to serve as an initial guess of the fluxes. Instead, a modeler can incorporate a wide range of environmental, economic, and/or land use data to estimate the fluxes. Traditionally, GIMs have been paired with in situ observations that number in the thousands or tens of thousands. However, the number of available atmospheric greenhouse gas observations has been growing rapidly as the number of satellites, airborne measurement campaigns, and in situ monitoring stations continues to increase. This era of prolific greenhouse gas observations presents computational and statistical challenges for inverse modeling frameworks that have traditionally been paired with a limited number of in situ monitoring sites. In this article, we discuss the challenges of estimating greenhouse gas fluxes using large atmospheric datasets with a particular focus on GIMs. We subsequently discuss several strategies for estimating the fluxes and quantifying uncertainties, strategies that are adapted from hydrology, applied math, or other academic fields and are compatible with a wide variety of atmospheric models. We further evaluate the accuracy and computational burden of each strategy using a synthetic CO2 case study based upon NASA's Orbiting Carbon Observatory 2 (OCO-2) satellite. Specifically, we simultaneously estimate a full year of 3-hourly CO2 fluxes across North America in one case study – a total of 9.4×10^6 unknown fluxes using 9.9×10^4 synthetic observations. The strategies discussed here provide accurate estimates of CO2 fluxes that are comparable to fluxes calculated directly or analytically. We are also able to approximate posterior uncertainties in the fluxes, but these approximations typically overestimate or underestimate the uncertainties, depending upon the strategy employed and the degree of approximation required to make the calculations manageable.
