Tracking climate models

Climate models are complex mathematical models designed by meteorologists, geophysicists, and climate scientists, and run as computer simulations, to predict climate. The predictions of the 20 global climate models, from various laboratories around the world, that inform the Intergovernmental Panel on Climate Change (IPCC) currently vary widely. Given temperature predictions from these 20 IPCC global climate models and over 100 years of historical temperature data, we track the changing sequence of which model predicts best at any given time. We use an algorithm due to Monteleoni and Jaakkola that models the sequence of observations with a hierarchical learner based on a set of generalized Hidden Markov Models, in which the identity of the current best climate model is the hidden variable. The transition probabilities between climate models are learned online, simultaneously with tracking the temperature predictions. On historical global mean temperature data, our online learning algorithm's average prediction loss nearly matches that of the best-performing climate model in hindsight. Moreover, it outperforms the average model prediction (the default practice in climate science), the median prediction, and least squares linear regression. We also experimented on climate model predictions through the year 2098. Simulating labels with the predictions of any one climate model, we found that our online learning algorithm performed significantly better than the other climate models and techniques. To complement our global results, we also ran experiments on IPCC global climate model temperature predictions for the specific geographic regions of Africa, Europe, and North America. On historical data, at both annual and monthly time-scales, and in future simulations, our algorithm typically outperformed both the best climate model per region and linear regression. Notably, our algorithm consistently outperformed the average prediction over models, the current benchmark. © 2011 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 4: 372–392, 2011
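
The hierarchical learner described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes squared loss on each climate model's prediction, a small fixed grid of candidate switching rates (`alphas`), a transition model that keeps the current best model with probability 1 - alpha and otherwise switches uniformly, and a top-level update that weights each switching rate by how well its sub-learner explained the latest observation. The function names `fixed_share_update` and `track_models` are hypothetical, chosen only for this example.

```python
import math


def fixed_share_update(weights, losses, alpha):
    """One update of the weights over climate models for a given switching rate.

    Assumed transition model: stay with the current best model with
    probability 1 - alpha, otherwise switch uniformly to another model.
    """
    n = len(weights)
    base = min(losses)  # shifting losses avoids underflow; it cancels on normalizing
    post = [w * math.exp(-(l - base)) for w, l in zip(weights, losses)]
    z = sum(post)
    post = [p / z for p in post]
    if n == 1:
        return post
    return [(1 - alpha) * p + alpha * (1 - p) / (n - 1) for p in post]


def track_models(predictions, observations, alphas):
    """Forecast each observation from the models' predictions, learning online.

    predictions[t][i] is model i's prediction at time t; observations[t] is
    revealed only after the forecast for time t has been made.
    """
    n = len(predictions[0])
    m = len(alphas)
    top = [1.0 / m] * m                     # weights over switching rates
    sub = [[1.0 / n] * n for _ in alphas]   # per-rate weights over models
    forecasts = []
    for x_t, y_t in zip(predictions, observations):
        # Each sub-learner forecasts with its own weighted average of models.
        sub_preds = [sum(w * x for w, x in zip(ws, x_t)) for ws in sub]
        forecasts.append(sum(a * p for a, p in zip(top, sub_preds)))

        # Assumption: squared loss per model, shifted by the common minimum.
        losses = [(x - y_t) ** 2 for x in x_t]
        base = min(losses)
        # Top level: reweight each switching rate by its sub-learner's fit.
        top = [
            a * sum(w * math.exp(-(l - base)) for w, l in zip(ws, losses))
            for a, ws in zip(top, sub)
        ]
        z = sum(top)
        top = [a / z for a in top]
        sub = [fixed_share_update(ws, losses, a) for ws, a in zip(sub, alphas)]
    return forecasts
```

Calling `track_models(predictions, observations, [0.01, 0.1, 0.3])` returns one forecast per time step. Because all weights start uniform, the first forecast equals the plain average over models; the learner departs from that benchmark only as observations arrive.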
