Caching Hints in Distributed Systems

Caching reduces the average cost of retrieving data by amortizing the lookup cost over several references to the data. Problems with maintaining strong cache consistency in a distributed system can be avoided by treating cached information as hints. A new approach to managing caches of hints suggests maintaining a minimum level of cache accuracy, rather than maximizing the cache hit ratio, in order to guarantee performance improvements. The desired accuracy is based on the ratio of lookup costs to the costs of detecting and recovering from invalid cache entries. Cache entries are aged so that they get purged when their estimated accuracy falls below the desired level. The age thresholds are dictated solely by clients' accuracy requirements instead of being suggested by data storage servers or system administrators.
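As a rough illustration of accuracy-based aging, the Python sketch below derives a break-even accuracy from the two costs and converts it into an age threshold. It is a minimal sketch under stated assumptions, not the method of the work summarized here. It assumes that using a valid hint costs nothing, that an invalid hint costs the detection-and-recovery penalty plus a full lookup, and that a hint's estimated accuracy decays exponentially with age at a known invalidation rate. Under the first two assumptions, a hint pays off only when its accuracy exceeds recovery_cost / (lookup_cost + recovery_cost). All names (min_accuracy, max_age, HintCache, invalidation_rate) are hypothetical.

```python
import math


def min_accuracy(lookup_cost, recovery_cost):
    """Break-even accuracy for a hint (assumed cost model).

    Expected cost with a hint of accuracy a is
    (1 - a) * (recovery_cost + lookup_cost); the hint helps only when
    this is below lookup_cost, i.e. a > recovery / (lookup + recovery).
    """
    return recovery_cost / (lookup_cost + recovery_cost)


def max_age(accuracy_floor, invalidation_rate):
    """Age at which estimated accuracy falls to accuracy_floor.

    Assumes accuracy(age) = exp(-invalidation_rate * age), which is an
    illustrative decay model and not taken from the source.
    """
    if invalidation_rate <= 0:
        raise ValueError("invalidation_rate must be positive")
    if accuracy_floor <= 0:
        return math.inf  # free recovery: a hint never needs purging
    return -math.log(accuracy_floor) / invalidation_rate


class HintCache:
    """Cache of hints purged by age; the threshold is set by the client."""

    def __init__(self, lookup_cost, recovery_cost, invalidation_rate):
        floor = min_accuracy(lookup_cost, recovery_cost)
        self.threshold = max_age(floor, invalidation_rate)
        self.entries = {}  # key -> (value, time the hint was cached)

    def put(self, key, value, now):
        self.entries[key] = (value, now)

    def get(self, key, now):
        """Return the hint, or None if it is absent or too old."""
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, cached_at = entry
        if now - cached_at > self.threshold:
            del self.entries[key]  # estimated accuracy is below the floor
            return None
        return value
```

With equal lookup and recovery costs the break-even accuracy is 0.5, so at an assumed invalidation rate of 0.1 per time unit a hint is purged once it is older than ln(2) / 0.1, roughly 6.9 time units. A caller would still have to validate any returned hint on use, since a hint may be wrong at any age.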