Abstracts: A Latency-Hiding Technique for High-Capacity Mass-Storage Systems

Abstract

Extraordinary advances in digital storage technology are rapidly making possible cost-effective, multiple-terabyte information retrieval systems. The latency and bandwidth of these technologies are typically much worse than what users of computer systems are accustomed to. Unfortunately, traditional techniques of reducing latency and improving bandwidth, caching and compression, by themselves will not work well with the access patterns that we anticipate for these high-capacity systems.

We introduce and define a new storage management technique, called abstracts. An abstract is an extraction of the “essential” part of the data set. It is created using some combination of averaging, subsetting, rounding, or some other method of condensing the data. An abstract’s composition is heavily dependent on the context in which it is used. Each data set can have multiple abstracts associated with it, each of which can be used to answer a different subset of the possible queries that might be posed about the data. When we are able to answer a query from an abstract, effective bandwidth increases, because we transfer much less data through the storage system.

The counter-intuitive result is that abstracts on robot-based tape storage systems can have lower latency than full data sets on magnetic disks, because the inherent latency disadvantage of tertiary systems can be overcome by the reduction in transfer time due to the smaller transfer size. Moreover, because many abstracts can fit in faster storage in the space occupied by a single unabstracted data set, users can get the effect of magnetic disk latencies for very large objects. To evaluate the potential of abstracts, we examine four common queries as well as a detailed case study. We also study the statistical characteristics of several data sets in an effort to identify classes of abstracting functions.
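
The latency argument can be illustrated with a minimal sketch. It assumes a simple cost model in which retrieval time is a fixed access latency plus transfer time (size divided by bandwidth), and it uses block averaging as one example of an abstracting function. The function names, device parameters, and the 100:1 reduction ratio are hypothetical values chosen for illustration only; they are not measurements or definitions taken from the paper.

```python
def retrieval_time(size_bytes, access_latency_s, bandwidth_bytes_per_s):
    """Assumed cost model: fixed access latency plus transfer time."""
    return access_latency_s + size_bytes / bandwidth_bytes_per_s


def block_average(values, block):
    """One possible abstracting function: replace each run of `block`
    values with its mean, shrinking the data by roughly that factor."""
    result = []
    for i in range(0, len(values), block):
        chunk = values[i:i + block]
        result.append(sum(chunk) / len(chunk))
    return result


# Hypothetical device parameters (illustrative, not from the paper).
DISK_LATENCY_S = 0.02    # magnetic disk: short access latency
DISK_BANDWIDTH = 2e6     # bytes per second
TAPE_LATENCY_S = 60.0    # robot-based tape: mount and seek dominate
TAPE_BANDWIDTH = 1e6     # bytes per second

FULL_SIZE = 1e9                    # 1 GB unabstracted data set
ABSTRACT_SIZE = FULL_SIZE / 100    # assumed 100:1 reduction

full_on_disk = retrieval_time(FULL_SIZE, DISK_LATENCY_S, DISK_BANDWIDTH)
abstract_on_tape = retrieval_time(ABSTRACT_SIZE, TAPE_LATENCY_S, TAPE_BANDWIDTH)

print(f"full data set on disk: {full_on_disk:.2f} s")
print(f"abstract on tape:      {abstract_on_tape:.2f} s")
print("abstract of [1..6], block 2:", block_average([1, 2, 3, 4, 5, 6], 2))
```

Under these assumed numbers the transfer term dominates for the full data set, so the abstract on tape is delivered sooner despite the much larger access latency. With a smaller data set or a weaker reduction ratio the comparison can go the other way, which is why the benefit depends on the query being answerable from the abstract.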