The goal of this study is the design of a database holding archives of weather forecast systems. A detailed description of the project and an analysis of the database performance are presented, along with experimental performance tests of key algorithms. The standard mode of data access is optimised for sequential access to individual 2D grids (time-local and spatially global), whereas new applications require reading long time series of localised data (spatially local and time-global). The design goal is to improve the performance of access to archival spatially-local, time-global data without visible degradation of the standard access mode. The database is designed as two separate layers. The Format Translation Layer (FTL) is an interface between the database and the file-based output of the simulation and analysis programs. The Distributed Data Storage Layer (DDSL) is responsible for secure data storage and efficient access. The FTL reads output forecasts and converts them to the spatially local format: it splits large 2D arrays of data into small patches and forms 3D arrays, using time as the third dimension. The theoretical performance analysis shows that an improvement of four orders of magnitude over the standard serial access, and of two orders of magnitude over a parallelised version of the spatially global access, can be achieved.
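The regrouping performed by the FTL can be illustrated with a minimal sketch. It assumes forecasts arrive as a list of equally sized 2D grids, one per time step; the function names, the patch dimensions, and the in-memory dictionary are hypothetical choices made for illustration and do not describe the actual FTL or DDSL storage format.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]            # one 2D forecast field, indexed [row][col]
Patch3D = List[List[List[float]]]   # one patch over time, indexed [t][row][col]


def split_into_patches(
    grids: List[Grid], patch_rows: int, patch_cols: int
) -> Dict[Tuple[int, int], Patch3D]:
    """Regroup a time series of 2D grids into per-patch 3D arrays.

    Hypothetical helper: each patch is keyed by its (patch_row, patch_col)
    position and holds every time step for that small spatial region.
    """
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    patches: Dict[Tuple[int, int], Patch3D] = {}
    for r0 in range(0, n_rows, patch_rows):
        for c0 in range(0, n_cols, patch_cols):
            key = (r0 // patch_rows, c0 // patch_cols)
            # Time becomes the first (outermost) index of the patch.
            patches[key] = [
                [row[c0:c0 + patch_cols] for row in grid[r0:r0 + patch_rows]]
                for grid in grids
            ]
    return patches


def time_series(
    patches: Dict[Tuple[int, int], Patch3D],
    patch_rows: int,
    patch_cols: int,
    row: int,
    col: int,
) -> List[float]:
    """Read the full time series at one grid point from a single patch."""
    patch = patches[(row // patch_rows, col // patch_cols)]
    return [t_slice[row % patch_rows][col % patch_cols] for t_slice in patch]


# Three time steps of a 4x4 grid; each value encodes (t, row, col).
grids = [
    [[t * 100 + r * 10 + c for c in range(4)] for r in range(4)]
    for t in range(3)
]
patches = split_into_patches(grids, patch_rows=2, patch_cols=2)
series = time_series(patches, 2, 2, row=3, col=1)
```

In this layout a spatially-local, time-global query touches one patch instead of one full grid per time step, which is the access pattern the design aims to speed up.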