Data placement techniques for serpentine tapes

Due to the information explosion, a growing number of applications store, maintain, and retrieve large volumes of data that must be available online or near-online. These data repositories are implemented using hierarchical storage structures (HSS). One component of an HSS is tertiary storage, which provides cost-effective storage for the vast amount of data these applications manipulate. However, the difference of 3-4 orders of magnitude in access time between tertiary and secondary storage must be bridged to allow online or near-online access to tertiary-resident data. This wide access gap is mainly due to two factors: the sequential nature of the most popular tertiary technology (i.e., tape) and the small number of drives relative to media in tertiary storage jukeboxes. In this paper we propose a novel data placement technique designed specifically for serpentine tape technology, namely wrap-around data placement (WARP). We focus on tape because it provides the most cost-effective storage for very large databases, and on serpentine tapes in particular because they are increasingly the technology of choice for mid-range and high-end systems. WARP may reduce access time by an order of magnitude, depending on the tape device specifications and object sizes. An important feature of WARP is that it optimizes access time independently of the retrieval order. This is achieved by exploiting the characteristics of serpentine tape technology rather than those of the application.