Cloud archiving and data mining of High-Resolution Rapid Refresh forecast model output

Weather-related research often requires synthesizing vast amounts of data, which in turn calls for archival solutions that are both economical and viable during and beyond the lifetime of a project. Public cloud computing services (e.g., from Amazon, Microsoft, or Google) and private clouds managed by research institutions now provide object data storage systems that are potentially appropriate for long-term archives of such large geophysical data sets. We illustrate the use of a private cloud object store developed by the Center for High Performance Computing (CHPC) at the University of Utah. Since early 2015, we have been archiving thousands of two-dimensional gridded fields (each containing over 1.9 million values over the contiguous United States) from the High-Resolution Rapid Refresh (HRRR) data assimilation and forecast modeling system. The archive is being used for retrospective analyses of meteorological conditions during high-impact weather events, for assessing the accuracy of the HRRR forecasts, and for providing initial and boundary conditions for research simulations. The archive is accessible interactively, and researchers at other institutions can use automated download procedures that can be tailored to extract individual two-dimensional grids from within the highly compressed files. Characteristics of the CHPC object storage system are summarized relative to network file system storage and tape storage solutions. The CHPC storage system is proving to be a scalable, reliable, extensible, affordable, and usable archive solution for our research.

High-resolution weather model output is archived in an object data storage system. Object storage is an affordable, usable, and reliable long-term archive solution. High-impact weather events are used to illustrate efficient data retrieval from the archive. The model output archive makes it possible to initialize weather research simulations.
