Disk read-write optimizations and data integrity in transaction systems using write-ahead logging

We discuss several disk read-write optimizations that are implemented in various transaction systems and disk hardware to improve performance. These include: (1) writing the sectors of a multi-sector write to disk out of sequence (as SCSI disk interfaces do); (2) avoiding the initialization of pages on disk when a file is extended; (3) not accessing individual pages during a mass delete operation (e.g., dropping an index from a file that contains multiple indexes); (4) permitting a previously deallocated page to be reallocated without reading the deallocated version of the page from disk; (5) purging a file's pages from the buffer pool during a file erase operation (e.g., a table drop); and (6) avoiding logging for bulk operations such as index creation. We consider a system that implements the above optimizations, in which a page consists of multiple disk sectors and recovery is based on write-ahead logging with a log sequence number on every page. For such a system, we present a simple method for guaranteeing the detection of a partial disk write of a page. Detecting partial writes is very important, not only to ensure data integrity from the users' viewpoint but also to make the transaction system software work correctly. Once a partial write is detected, the affected page is easy to recover using media recovery techniques. Our method imposes minimal CPU and space overheads. It has been implemented in DB2/6000 and ADSM.
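
The abstract does not spell out the detection method itself. Purely as an illustration of what detecting a partial write of a multi-sector page involves, the Python sketch below stamps every sector of a page with the same one-byte value before the write and treats a page whose sectors disagree as partially written. The sector size, sectors per page, function names, and the stamping scheme are all assumptions made for this example; it is not the method implemented in DB2/6000 or ADSM. It also assumes the stamps already on disk differ from the new stamp, which may not hold when pages are left uninitialized or are reallocated without being read.

```python
from typing import Iterable

# Hypothetical layout: a page is 8 sectors of 512 bytes each.
SECTOR_SIZE = 512
SECTORS_PER_PAGE = 8
PAGE_SIZE = SECTOR_SIZE * SECTORS_PER_PAGE


def stamp_page(page: bytearray, stamp: int) -> None:
    """Store the same one-byte stamp in the last byte of every sector."""
    if len(page) != PAGE_SIZE:
        raise ValueError("unexpected page size")
    for s in range(SECTORS_PER_PAGE):
        page[(s + 1) * SECTOR_SIZE - 1] = stamp & 0xFF


def is_torn(page: bytes) -> bool:
    """Return True if the sectors do not all carry the same stamp."""
    stamps = {page[(s + 1) * SECTOR_SIZE - 1] for s in range(SECTORS_PER_PAGE)}
    return len(stamps) != 1


def simulate_partial_write(old: bytes, new: bytes, written: Iterable[int]) -> bytes:
    """Model a crash in which only the sectors listed in `written` reach disk.

    The sectors may be given in any order, mirroring out-of-sequence writes.
    """
    out = bytearray(old)
    for s in written:
        lo, hi = s * SECTOR_SIZE, (s + 1) * SECTOR_SIZE
        out[lo:hi] = new[lo:hi]
    return bytes(out)
```

In this sketch, a page image produced by `simulate_partial_write` with only some sector indexes written mixes old and new stamps, so `is_torn` flags it; a reader would then hand the page to media recovery rather than trust its contents.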