File access prediction with adjustable accuracy

We describe Recent Popularity, a novel on-line file access predictor that adapts rapidly to workload changes while predicting more events with greater accuracy than prior efforts. We distinguish the goal of predicting the most events accurately from the goal of offering the most accurate predictions (when declining to offer a prediction is acceptable). To this end we present two distinct measures of accuracy, general and specific accuracy, corresponding to these two goals. We describe how both our new predictor and an earlier effort, Noah, can trade the number of events predicted for prediction accuracy by adjusting simple parameters. When prediction accuracy is strictly more important than the number of predictions offered, trace-based evaluation demonstrates error rates as low as 2% while still offering predictions for more than 60% of all file access events.
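The distinction between the two accuracy measures, and the trade between predictions offered and accuracy, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the text above: general accuracy is read as correct predictions divided by all events, specific accuracy as correct predictions divided by predictions actually offered, and the predictor is a hypothetical successor predictor that offers a prediction only when one candidate accounts for at least `j` of the last `k` successors observed for the current file. The names `evaluate`, `k`, and `j` are illustrative.

```python
from collections import Counter, defaultdict, deque


def evaluate(trace, k=4, j=3):
    """Replay a file access trace and return (general, specific, coverage).

    Assumed definitions (illustrative, not taken from the abstract):
      general  = correct predictions / all predictable events
      specific = correct predictions / predictions offered
      coverage = predictions offered / all predictable events
    """
    # For each file, remember the last k files that immediately followed it.
    history = defaultdict(lambda: deque(maxlen=k))
    events = offered = correct = 0
    prev = None

    for cur in trace:
        if prev is not None:
            events += 1
            recent = history[prev]
            if recent:
                candidate, count = Counter(recent).most_common(1)[0]
                # Decline to predict unless the candidate is popular enough.
                if count >= j:
                    offered += 1
                    if candidate == cur:
                        correct += 1
            # Learn from the successor that was actually observed.
            recent.append(cur)
        prev = cur

    general = correct / events if events else 0.0
    specific = correct / offered if offered else 0.0
    coverage = offered / events if events else 0.0
    return general, specific, coverage


if __name__ == "__main__":
    trace = ["a", "b"] * 4
    for j in (1, 2):
        general, specific, coverage = evaluate(trace, k=2, j=j)
        print(f"j={j}: general={general:.2f} "
              f"specific={specific:.2f} coverage={coverage:.2f}")
```

In this sketch, raising `j` toward `k` makes the predictor decline more often, which lowers coverage and can change specific accuracy, while lowering `j` lets it offer more predictions. On the strictly alternating trace used above with `k=2`, the stricter setting `j=2` offers predictions for 3 of the 7 transitions and `j=1` for 5 of 7, all of them correct in both cases.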
