DRAFT: COMMENTS SOLICITED Transformation Catalog Design for GriPhyN
As part of the GriPhyN project (www.griphyn.org), we are developing a transformation catalog that stores information about the transformations needed to process data as requested by the user. The catalog can be used to process raw or derived data products. Data derived using the transformations may be placed in the Metadata Catalog or the Replica Catalog [1]. The transformation catalog will be evaluated in the context of the LIGO prototype. The existing prototype takes a set of channel names together with a start and end time as input and queries the replica catalog for the location of the requested data. If the files are not found, the computation that produces them is scheduled on LIGO's computational resources. LIGO data frames have a granularity of 50 seconds; frames can be concatenated to cover longer time spans. The results are provided to the user as a URL. The newly produced 50-second frames are then entered into the replica catalog for future reference, whereas the concatenated file returned to the user is not recorded there. Currently, the prototype uses only two transformations. With the transformation catalog, the user will be able to ask for a rich set of derived data. This is another step towards realizing the concept of Virtual Data, in which the user or application can request data whether or not it has been materialized. If the materialized data is not available, or the cost of accessing the processed data is greater than the cost of applying the transformation, then we need to obtain information about where the appropriate transformation is located. This document aims to specify a schema for a transformation catalog that holds information about the transformation files and their locations.
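The request flow described above can be illustrated with a minimal Python sketch. It is not the prototype's code: the in-memory `ReplicaCatalog`, the `produce_frame` placeholder, the URL layout, and the function names are all hypothetical, and a single channel is handled for brevity. Only the 50-second frame granularity, the lookup-then-compute behaviour, the registration of newly produced frames, and the cost comparison are taken from the text.

```python
from dataclasses import dataclass, field

FRAME_SECONDS = 50  # frame granularity stated in the text


@dataclass
class ReplicaCatalog:
    """Hypothetical in-memory stand-in mapping (channel, frame start) to a URL."""
    entries: dict = field(default_factory=dict)

    def lookup(self, channel, start):
        # Returns None when no replica is registered for this frame.
        return self.entries.get((channel, start))

    def register(self, channel, start, url):
        self.entries[(channel, start)] = url


def frame_starts(start, end):
    """Start times of the 50-second frames that cover the interval [start, end)."""
    first = (start // FRAME_SECONDS) * FRAME_SECONDS
    return list(range(first, end, FRAME_SECONDS))


def produce_frame(channel, start):
    """Placeholder for scheduling a computation; the URL layout is invented."""
    return f"file:///frames/{channel}/{start}.frame"


def request(catalog, channel, start, end):
    """Collect frame URLs for one channel, producing and registering missing frames.

    Returns the per-frame URLs and the start times of frames that had to be
    produced. Concatenation into a single result file is omitted; per the text,
    that concatenated file would not be registered in the replica catalog.
    """
    urls, produced = [], []
    for t in frame_starts(start, end):
        url = catalog.lookup(channel, t)
        if url is None:
            url = produce_frame(channel, t)
            catalog.register(channel, t, url)  # new frames are recorded for reuse
            produced.append(t)
        urls.append(url)
    return urls, produced


def needs_transformation(materialized, access_cost, transform_cost):
    """Virtual Data decision: derive the data if it is not materialized,
    or if accessing it costs more than applying the transformation."""
    return (not materialized) or access_cost > transform_cost
```

In this sketch, a second identical call to `request` produces no new frames, because the first call registered every frame it computed. When `needs_transformation` is true, the transformation catalog is the component that would be consulted to locate the transformation.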