Statistical Metadata Modeling and Transformations

The term metadata is used in many different sciences. Statistical metadata is generally taken to denote “every piece of information required by a data user to properly understand and use statistical data.” Modern statistical information systems (SIS) make extensive and active use of metadata, organised in relational or complex object-oriented metadata models. The early phases of many software development projects emphasize the design of a conceptual data/metadata model. Such a design can be refined into a logical data/metadata model, which in later stages may be translated into a physical data/metadata model. Organisational aspects, user requirements and constraints imposed by the existing data warehouse architecture lead to a conceptual architecture for metadata management, based on a common, semantically rich, object-oriented data/metadata model that integrates the main steps of data processing and covers all aspects of data warehousing (Pool et al., 2002). In this paper we examine data/metadata modeling according to the techniques and paradigms used for metadata schema development. However, integrating a model into a SIS is not by itself sufficient for the automatic manipulation of related datasets and for quality assurance unless it is accompanied by certain operators/transformations. Two types of transformations can be considered: (i) those used to alleviate breaks in time series and (ii) a set of model-integrated operators for automating data/metadata management and minimizing human errors. The latter category is discussed extensively. Finally, we illustrate the applicability of our scientific framework in the area of biomedical statistics.
