Supporting real-world activities in database management systems

In many application domains, the data processing cycle is complex and may involve real-world activities that are external to the database, e.g., wet-lab experiments, instrument readings, and manual measurements. These activities may take a long time to prepare for and to perform, and hence introduce inherently long delays between updates in the database. These long delays, combined with the need for intermediate results to be available immediately, make supporting real-world activities in the database engine a challenging task. In this paper, we address these challenges through a system that enables users to reflect their updates immediately in the database while keeping track of the dependent and potentially invalid data items until they are revalidated. The proposed system includes: (1) semantics and syntax for interfaces through which users can express the dependencies among data items, (2) new operators that alert users when the returned query results contain potentially invalid or out-of-date data, and that enable evaluating queries on either valid data only or on both valid and potentially invalid data, and (3) mechanisms for data invalidation and revalidation. The proposed system is being realized via extensions to PostgreSQL.
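
The three components listed above can be illustrated with a small in-memory sketch. It is a toy model under stated assumptions, not the PostgreSQL extension itself: the class `DependencyTracker`, its methods `add_dependency`, `update`, `revalidate`, and `query`, and the item names are all hypothetical and chosen only for illustration. The sketch assumes that a user-supplied update is itself trusted, and that every item transitively derived from the updated item becomes potentially invalid until it is revalidated.

```python
from collections import deque


class DependencyTracker:
    """Toy in-memory model of invalidation and revalidation (hypothetical API)."""

    def __init__(self):
        self.values = {}       # item -> current value
        self.dependents = {}   # source item -> set of items derived from it
        self.invalid = set()   # items that are potentially invalid

    def add_dependency(self, source, target):
        """Declare that `target` is derived from `source`."""
        self.dependents.setdefault(source, set()).add(target)

    def update(self, item, value):
        """Apply the update immediately, then invalidate transitive dependents."""
        self.values[item] = value
        self.invalid.discard(item)  # assumption: the new value is trusted
        seen = {item}               # guards against cyclic dependencies
        queue = deque(self.dependents.get(item, ()))
        while queue:
            dep = queue.popleft()
            if dep in seen:
                continue
            seen.add(dep)
            self.invalid.add(dep)
            queue.extend(self.dependents.get(dep, ()))

    def revalidate(self, item, value):
        """Record the outcome of the real-world activity for `item`.

        The item becomes valid again; its own dependents stay (or become)
        potentially invalid until they are revalidated in turn.
        """
        self.update(item, value)

    def query(self, items, valid_only=False):
        """Return (item, value, is_valid) tuples, optionally only valid ones."""
        rows = [(i, self.values[i], i not in self.invalid) for i in items]
        if valid_only:
            rows = [row for row in rows if row[2]]
        return rows


if __name__ == "__main__":
    db = DependencyTracker()
    for name, value in [("sequence", "ACGT"), ("protein", "MK"), ("function", "kinase")]:
        db.update(name, value)
    db.add_dependency("sequence", "protein")
    db.add_dependency("protein", "function")

    db.update("sequence", "ACGA")  # visible at once; dependents are flagged
    print(db.query(["sequence", "protein", "function"]))
    print(db.query(["sequence", "protein", "function"], valid_only=True))
```

In this sketch the `is_valid` flag plays the role of the alert on potentially invalid results, and `valid_only` mirrors the choice between evaluating a query on valid data only or on both valid and potentially invalid data.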
