Long-term Unix file system activity and the efficacy of automatic file migration
This dissertation studies long-term file system activity in order to develop new migration algorithms for tertiary storage systems (e.g., tape robots). To enable this study, I collected file traces from four computing facilities and analyzed file activity patterns in the areas of size, usage, access, creation, deletion, and modification rates and patterns, inter-reference periods, and lifetimes. My file activity results include: most files are never used, accesses dominate file activity, files that are modified grow or shrink very little, file activity has reference locality, and there are significant differences between how the Unix operating system deals with files and what users perceive the operating system does with their files (i.e., the Unix operating system's numeric index versus the hierarchical name space used by people).
I also analyzed the collected data for self-similar behavior and showed that file system traffic is self-similar, or fractal. This has profound implications for computer simulations and modeling, because normal simulation assumptions (i.e., Poisson models with finite-variance arrival rates) are invalid for self-similar traffic. Thus, most existing file system simulators and models are probably inaccurate.
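As an illustration of what testing for self-similarity can involve, the sketch below estimates the Hurst parameter H with the variance-time method. This is one common estimator, not necessarily the one used in the dissertation, which the abstract does not name. The function names, block sizes, and synthetic input are assumptions made for the example. For independent, Poisson-like arrivals the estimate should be close to 0.5; values well above 0.5 would suggest self-similar traffic.

```python
import math
import random
import statistics


def aggregate(series, m):
    """Average the series over non-overlapping blocks of length m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def hurst_variance_time(series, block_sizes):
    """Estimate H from the slope of log(variance) against log(block size).

    For a self-similar process the variance of the block-averaged series
    decays like m**(2H - 2), so H = 1 + slope / 2.
    """
    xs, ys = [], []
    for m in block_sizes:
        var = statistics.pvariance(aggregate(series, m))
        xs.append(math.log(m))
        ys.append(math.log(var))
    # Ordinary least-squares slope of ys on xs.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return 1.0 + (num / den) / 2.0


# Hypothetical input: independent per-interval activity levels, standing in
# for traffic with no long-range dependence. Real traces would replace this.
rng = random.Random(0)
iid = [rng.random() for _ in range(20000)]
h_iid = hurst_variance_time(iid, [1, 2, 4, 8, 16, 32, 64])
print(f"Estimated H for independent data: {h_iid:.2f}")
```

Applied to per-interval request counts from a real trace, the same function gives a rough first indication only; the variance-time estimate is known to be noisy and is usually cross-checked with other estimators.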
Finally, I developed a new migration algorithm that is an order of magnitude more effective than any existing migration algorithm when measured using on-disk misses. Other measures of effectiveness, such as the number of files and bytes moved to tertiary storage and the number of forced mid-day migrations, are also improved.