Cumulo: A Dataset for Learning Cloud Classes

One of the greatest sources of uncertainty in future climate projections comes from limitations in modelling clouds and in understanding how different cloud types interact with the climate system. A key first step in reducing this uncertainty is to accurately classify cloud types at high spatial and temporal resolution. In this paper, we introduce Cumulo, a benchmark dataset for training and evaluating global cloud classification models. It consists of one year of 1 km resolution MODIS hyperspectral imagery merged with pixel-width 'tracks' of CloudSat cloud labels. Bringing these complementary datasets together is a crucial step that enables the machine learning community to develop new techniques that could greatly benefit the climate community. To showcase Cumulo, we provide a baseline performance analysis using an invertible flow generative model (IResNet), which further allows us to discover new sub-classes within a given cloud class by exploring the latent space. To compare methods, we introduce a set of evaluation criteria that identify models that are not only accurate but also physically realistic.
