Long-term file activity patterns in a UNIX workstation environment

As mass storage technology becomes more affordable for sites smaller than supercomputer centers, understanding file access patterns at these sites becomes crucial for developing systems that store rarely used data on tertiary storage devices such as tapes and optical disks. This paper presents a new way to collect and analyze file system statistics for UNIX-based file systems. The collection system runs in user space and requires no modification of the operating system kernel. The statistics package provides details about file system operations at the file level, including creations, deletions, and modifications. The paper analyzes four months of file system activity on a university file system. The results confirm previously published results gathered from supercomputer file systems, but differ in several important areas. Files in this study were considerably smaller than those at supercomputer centers, and they were accessed less frequently. Additionally, the long-term creation rate on workstation file systems is low enough that all data more than a day old could be cheaply saved on a mass storage device, allowing the integration of time travel into every file system.
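To make the idea of kernel-free, file-level collection concrete, the following is a minimal sketch of one way such statistics could be gathered from user space: scan the directory tree periodically, record per-file metadata, and compare successive scans. This is an assumption for illustration, not the collection system the paper describes; the names `FileInfo`, `snapshot`, and `diff_snapshots` are hypothetical.

```python
import os
import stat
from typing import Dict, NamedTuple, Set, Tuple


class FileInfo(NamedTuple):
    """Per-file metadata kept from one scan (hypothetical record layout)."""
    size: int
    mtime: float
    atime: float


def snapshot(root: str) -> Dict[str, FileInfo]:
    """Walk `root` and record size and timestamps for every regular file."""
    result: Dict[str, FileInfo] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symbolic links
            except OSError:
                continue  # file vanished between listing and stat
            if stat.S_ISREG(st.st_mode):
                result[path] = FileInfo(st.st_size, st.st_mtime, st.st_atime)
    return result


def diff_snapshots(
    old: Dict[str, FileInfo], new: Dict[str, FileInfo]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (created, deleted, modified) paths between two scans."""
    old_paths, new_paths = set(old), set(new)
    created = new_paths - old_paths
    deleted = old_paths - new_paths
    # Assumption: a change in size or mtime counts as a modification.
    modified = {
        p for p in old_paths & new_paths
        if old[p].size != new[p].size or old[p].mtime != new[p].mtime
    }
    return created, deleted, modified
```

A scan-and-compare approach of this kind needs only ordinary read access to the tree, but it cannot observe files that are created and deleted between two scans, and a file that is deleted and recreated under the same name shows up as a modification.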
