Towards improved and more routine Earth system model evaluation in CMIP

Abstract. The Coupled Model Intercomparison Project (CMIP) has successfully provided the climate community with a rich collection of simulation output from Earth system models (ESMs) that can be used to understand past climate changes and to make projections of the future, together with estimates of their uncertainty. Confidence in ESMs can be gained because the models are based on physical principles and reproduce many important aspects of observed climate. More research is required to identify the processes that are most responsible for systematic biases and for the magnitude and uncertainty of future projections, so that more relevant performance tests can be developed. At the same time, many aspects of ESM evaluation are well established and considered an essential part of systematic evaluation, yet have been implemented ad hoc with little community coordination. Given the diversity and complexity of ESM analysis, we argue that the CMIP community has reached a critical juncture at which many baseline aspects of model evaluation need to be performed much more efficiently and consistently. Here, we provide a perspective on how a more systematic, open, and rapid performance assessment of the large and diverse number of models that will participate in current and future phases of CMIP can be achieved, and we announce our intention to implement such a system for CMIP6. Accomplishing this could also free up valuable resources, as many scientists frequently "re-invent the wheel" by re-writing routines for well-established analysis methods. A more systematic approach for the community would be to develop and apply evaluation tools that are based on the latest scientific knowledge and observational reference data, are well suited for routine use, and provide a wide range of diagnostics and performance metrics that comprehensively characterize model behaviour as soon as the output is published to the Earth System Grid Federation (ESGF). The CMIP infrastructure enforces data standards and conventions for model output and documentation accessible via the ESGF, and additionally publishes observations (obs4MIPs) and reanalyses (ana4MIPs) for model intercomparison projects using the same data structure and organization as the ESM output. This greatly facilitates routine evaluation of the ESMs, but to process the data automatically alongside the ESGF, the infrastructure needs to be extended with processing capabilities at the ESGF data nodes, where the evaluation tools can be executed on a routine basis. Efforts are already underway to develop community-based evaluation tools, and we encourage experts to provide additional diagnostic codes that would enhance this capability for CMIP. At the same time, we encourage the community to contribute observations and reanalyses for model evaluation to the obs4MIPs and ana4MIPs archives. The intention is to produce, through the ESGF, a widely accepted quasi-operational evaluation framework for CMIP6 that would routinely execute a series of standardized evaluation tasks. Over time, as this capability matures, we expect to produce an increasingly systematic characterization of the models which, compared with earlier phases of CMIP, will more quickly and openly identify the strengths and weaknesses of the simulations. This will also reveal whether long-standing model errors remain evident in newer models and will assist modelling groups in improving their models. This framework will be designed to readily incorporate updates, including new observations and additional diagnostics and metrics, as they become available from the research community.
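To illustrate the kind of standardized evaluation task such a framework would execute routinely, the following minimal sketch (in Python, one of the languages used by community tools such as ESMValTool) computes an area-weighted root-mean-square error of a model's surface air temperature climatology against an obs4MIPs reference. The file names, the variable name "tas", and the use of the xarray library are illustrative assumptions rather than part of the framework itself; the essential point is that, because model output and obs4MIPs data follow the same CF/CMOR conventions, a single routine can read both.

# A minimal sketch (not the ESMValTool implementation itself) of a routine
# performance metric: the area-weighted RMSE of a model's near-surface air
# temperature climatology against an obs4MIPs reference. File names and the
# variable name "tas" are illustrative assumptions; both files are assumed
# to follow the same CF/CMOR conventions, which is what allows one routine
# to read model and observational data alike.
import numpy as np
import xarray as xr

def area_weighted_rmse(model, ref):
    """Area-weighted RMSE of two fields on regular latitude-longitude grids."""
    # Interpolate the model field onto the reference grid so the two align.
    model_on_ref = model.interp(lat=ref["lat"], lon=ref["lon"])
    # cos(latitude) approximates the relative area of each grid cell.
    weights = np.cos(np.deg2rad(ref["lat"]))
    squared_error = (model_on_ref - ref) ** 2
    return float(np.sqrt(squared_error.weighted(weights).mean(("lat", "lon"))))

# Hypothetical CMOR-style file names; at an ESGF data node these would be
# resolved through the catalogue as soon as the model output is published.
model_tas = xr.open_dataset(
    "tas_Amon_MODEL_historical_r1i1p1f1_gn_200001-200412.nc")["tas"]
obs_tas = xr.open_dataset(
    "tas_Amon_OBS_obs4MIPs_200001-200412.nc")["tas"]

# Compare climatological (time-mean) fields over the common period.
score = area_weighted_rmse(model_tas.mean("time"), obs_tas.mean("time"))
print(f"Area-weighted RMSE of climatological tas: {score:.2f} K")

Because such a metric depends only on the standardized structure of its input files, it could be executed automatically at an ESGF data node as new model output becomes available, which is precisely the quasi-operational mode of evaluation envisaged here.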
