Caching reduces the average cost of retrieving data by amortizing the lookup cost over several references to the data. Problems with maintaining strong cache consistency in a distributed system can be avoided by treating cached information as hints. A new approach to managing caches of hints suggests maintaining a minimum level of cache accuracy, rather than maximizing the cache hit ratio, in order to guarantee performance improvements. The desired accuracy is based on the ratio of lookup costs to the costs of detecting and recovering from invalid cache entries. Cache entries are aged so that they get purged when their estimated accuracy falls below the desired level. The age thresholds are dictated solely by clients' accuracy requirements instead of being suggested by data storage servers or system administrators.
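The relationship between the cost ratio, the accuracy floor, and the age threshold can be illustrated with a small sketch. It is not the paper's model; it rests on assumptions added here for illustration. A stale hint is assumed to cost the recovery cost plus a fresh lookup, and a valid hint is assumed to cost nothing, so a hint with accuracy `a` pays off only when `(1 - a) * (recovery + lookup) < lookup`. The accuracy of an entry is further assumed to decay exponentially with age at a known `change_rate`. The names `min_accuracy`, `max_age`, and `HintCache` are hypothetical.

```python
import math


def min_accuracy(lookup_cost, recovery_cost):
    """Break-even accuracy under the assumed cost model.

    A hint is worthwhile only if (1 - a) * (recovery + lookup) < lookup,
    which rearranges to a > recovery / (recovery + lookup).
    """
    return recovery_cost / (recovery_cost + lookup_cost)


def max_age(accuracy_floor, change_rate):
    """Age at which the estimated accuracy exp(-change_rate * age)
    drops to accuracy_floor (exponential decay is an assumption)."""
    if accuracy_floor <= 0:
        return math.inf  # free recovery: entries never need purging
    return -math.log(accuracy_floor) / change_rate


class HintCache:
    """Cache of hints whose entries are purged once they exceed the age threshold."""

    def __init__(self, lookup_cost, recovery_cost, change_rate):
        floor = min_accuracy(lookup_cost, recovery_cost)
        self.threshold = max_age(floor, change_rate)
        self.entries = {}  # key -> (value, time stored)

    def put(self, key, value, now):
        self.entries[key] = (value, now)

    def get(self, key, now):
        item = self.entries.get(key)
        if item is None:
            return None
        value, stored_at = item
        if now - stored_at > self.threshold:
            # Estimated accuracy is below the floor: purge and report a miss.
            del self.entries[key]
            return None
        return value


# Hypothetical costs: a lookup costs 30 units, recovering from a bad hint costs 10.
cache = HintCache(lookup_cost=30, recovery_cost=10, change_rate=0.1)
cache.put("printer", "host-a", now=0)
fresh = cache.get("printer", now=10)  # younger than the threshold, still served
aged = cache.get("printer", now=20)   # older than the threshold, purged
```

In this sketch the threshold is computed by the client from its own costs alone, which mirrors the abstract's point that age thresholds follow from clients' accuracy requirements rather than from values supplied by servers or administrators.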