AIEADA 1.0: Efficient high-dimensional variational data assimilation with machine-learned reduced-order models

Data assimilation (DA) in the geophysical sciences remains the cornerstone of robust forecasts from numerical models. DA plays a central role in the quality of numerical weather prediction and has been a key building block behind the dramatic improvements in weather forecasting over the past few decades. DA is commonly framed in a variational setting, where one solves an optimization problem within a Bayesian formulation, with raw model forecasts serving as the prior and observations informing the likelihood. This leads to a DA objective function to be minimized, where the decision variables are the initial conditions supplied to the model. In traditional DA, the forward model is computationally expensive. Here we replace the forward model with a low-dimensional, data-driven, and differentiable emulator. Consequently, gradients of the DA objective function with respect to the decision variables are obtained rapidly via automatic differentiation. We demonstrate our approach by performing an emulator-assisted DA forecast of geopotential height. Our results indicate that emulator-assisted DA is faster than traditional equation-based DA forecasts by four orders of magnitude, allowing computations to be performed on a workstation rather than a dedicated high-performance computer. In addition, we describe the accuracy benefits of emulator-assisted DA when compared with simply using the emulator for forecasting (i.e., without DA).
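For concreteness, the objective referred to above can be written in its standard strong-constraint variational (4D-Var) form (a textbook formulation; the specific operators and covariances used by AIEADA are not restated here):

J(\mathbf{x}_0) = \frac{1}{2}(\mathbf{x}_0-\mathbf{x}_b)^\top \mathbf{B}^{-1}(\mathbf{x}_0-\mathbf{x}_b) + \frac{1}{2}\sum_{k=0}^{K}\big(\mathbf{y}_k-\mathcal{H}_k(\mathcal{M}_{0\to k}(\mathbf{x}_0))\big)^\top \mathbf{R}_k^{-1}\big(\mathbf{y}_k-\mathcal{H}_k(\mathcal{M}_{0\to k}(\mathbf{x}_0))\big),

where \mathbf{x}_b is the background (prior) state, \mathbf{B} and \mathbf{R}_k are the background- and observation-error covariances, \mathbf{y}_k are the observations, \mathcal{H}_k is the observation operator, and \mathcal{M}_{0\to k} is the forward model that the emulator replaces.

The following minimal sketch illustrates the autodiff idea in JAX. The emulator, operators, and dimensions are illustrative placeholders (a single dense layer stands in for the learned reduced-order model acting on POD coefficients); this is not the AIEADA implementation itself.

    import jax
    import jax.numpy as jnp
    from jax.scipy.optimize import minimize

    # Hypothetical differentiable emulator: one dense layer standing in for the
    # learned reduced-order model that advances POD coefficients by one step.
    def emulator_step(params, a):
        return jnp.tanh(a @ params["W"] + params["b"])

    def cost(a0, params, a_b, obs, H, B_inv, R_inv):
        # Background (prior) term.
        d = a0 - a_b
        J = 0.5 * d @ B_inv @ d
        # Observation terms accumulated along the emulated trajectory.
        a = a0
        for y in obs:
            a = emulator_step(params, a)
            resid = y - H @ a
            J += 0.5 * resid @ R_inv @ resid
        return J

    # Gradient of the cost with respect to the initial condition via autodiff.
    grad_cost = jax.grad(cost, argnums=0)

    r, m = 8, 4                                   # reduced and observed dimensions
    key = jax.random.PRNGKey(0)
    params = {"W": 0.1 * jax.random.normal(key, (r, r)), "b": jnp.zeros(r)}
    a_b = jnp.zeros(r)                            # background initial condition
    H = jnp.eye(m, r)                             # linear observation operator
    obs = [0.1 * jnp.ones(m) for _ in range(5)]   # synthetic observations
    B_inv, R_inv = jnp.eye(r), jnp.eye(m)

    print("gradient at background:",
          grad_cost(a_b, params, a_b, obs, H, B_inv, R_inv))

    # Minimize the cost over the initial condition with a gradient-based solver.
    res = minimize(lambda a0: cost(a0, params, a_b, obs, H, B_inv, R_inv),
                   a_b, method="BFGS")
    print("analysis increment:", res.x - a_b)

In the full setting, emulator_step would be the trained recurrent surrogate acting on the reduced coordinates, and any gradient-based optimizer could consume the automatically differentiated gradients in place of the BFGS call used here.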
