Improved adaptive replacement algorithm for disk caches in HSM systems

With an ever-increasing amount of data to store, hierarchical storage management (HSM) systems must still use tape for tertiary storage. A disk cache is used to reduce the access time to data stored on tapes in a robot device. Because tape devices are accessed sequentially, some HSM systems transfer whole files between disk cache and tape. In this case the disk cache is forced to handle storage objects of nonuniform size. In this article the term 'object' is used initially to emphasize that size is a property of the data stored in the disk cache. Thereafter, files are called cache objects and the disk cache is called the object cache. When dealing with file objects in an HSM system disk cache, size is not the only property that influences object replacement. A new replacement algorithm called ObjectLRU (OLRU) is introduced that considers several object properties when choosing objects for replacement. The performance of OLRU is evaluated using file system traces and cache simulation. Compared to the LRU replacement algorithm, OLRU improved cache hit rates for all simulated cache sizes. The gap between the hit rates of the LRU and OPT replacement algorithms, which ranges between 3.2 and 0.7 percent, is reduced to between 1.9 and 0.6 percent. A genetic algorithm is used to optimize the OLRU parameters online, which increases the adaptiveness of the OLRU algorithm.
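To make the evaluation setup concrete, the following is a minimal sketch of a trace-driven simulation for the LRU baseline on an object cache whose capacity is counted in bytes rather than in slots. It does not implement OLRU, whose object properties and parameters are not specified here. The names `LRUObjectCache` and `hit_rate`, the trace format of `(name, size)` pairs, and the choice to bypass objects larger than the cache are all assumptions made for illustration.

```python
from collections import OrderedDict


class LRUObjectCache:
    """LRU cache for whole-file objects of nonuniform size (illustrative)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        # Maps object name -> size; least recently used entry comes first.
        self.objects = OrderedDict()

    def access(self, name, size):
        """Return True on a hit; on a miss, stage the object in and return False."""
        if name in self.objects:
            self.objects.move_to_end(name)  # mark as most recently used
            return True
        if size > self.capacity:
            return False  # assumption: oversized objects bypass the cache
        # Evict least recently used objects until the new one fits.
        while self.used + size > self.capacity:
            _, evicted_size = self.objects.popitem(last=False)
            self.used -= evicted_size
        self.objects[name] = size
        self.used += size
        return False


def hit_rate(trace, capacity_bytes):
    """Replay a trace of (name, size) accesses and return the fraction of hits."""
    cache = LRUObjectCache(capacity_bytes)
    hits = sum(cache.access(name, size) for name, size in trace)
    return hits / len(trace) if trace else 0.0
```

Replaying the same trace for several values of `capacity_bytes` gives one hit rate per cache size, which is the kind of per-size comparison described above. A replacement policy that weighs further object properties would differ only in how the eviction victim is chosen inside the `while` loop.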