Coordinating backup/recovery and data consistency between database and file systems

Managing a combined store consisting of database data and file data in a robust and consistent manner is a challenge for database systems and content management systems. In such a hybrid system, objects such as images, videos, and engineering drawings are stored as files on a file server, while meta-data that references and indexes those files is stored in a relational database to take advantage of efficient search. In this paper we describe solutions for two potentially problematic aspects of such a data management system: backup/recovery and data consistency. We present algorithms for performing backup and recovery of the DBMS data in a coordinated fashion with the files on the file servers. These algorithms have been implemented in the IBM DB2/DataLinks product [1]. We also propose an efficient solution to the problem of keeping the content of a file consistent, from a reader's point of view, with the associated meta-data stored in the DBMS, without holding long-duration locks on meta-data tables. In our model, an object is accessed and edited in place through normal file system APIs, using a reference obtained via an SQL query on the database. To relate file modifications to meta-data updates, the user issues an update through the DBMS and commits the file and meta-data updates together.