A database server for next-generation scientific data management

The growth of scientific information and the increasing automation of data collection have made databases integral to many scientific disciplines, including life sciences, physics, meteorology, earth and atmospheric sciences, and chemistry. These sciences pose new data management challenges to current database system technologies. The thesis work presented in this paper proposes a database server for next-generation scientific data management. The proposed server addresses two core requirements of scientific databases, namely (1) annotation management and (2) complex dependencies involving human actions. We discuss the challenges involved in each of these requirements and present the key contributions and main results on each of the two fronts.
