A Trace-Based Study of SMB Network File System Workloads in an Academic Enterprise

Storage system traces are important for examining real-world applications, studying potential bottlenecks, and driving benchmarks in the evaluation of new system designs. While file system traces have been well studied in earlier work, it has been some time since the last examination of the SMB network file system. This work continues previous SMB studies to better understand how the protocol is used in a real-world production system at the University of Connecticut. The main contribution of our work is an exploration of I/O behavior in modern file system workloads, together with new examinations of the inter-arrival times and run times of I/O events. We further investigate whether the recent standard models for traffic remain accurate. Our findings show that the number of read and write events is significantly lower than the number of create events, and that the average number of bytes exchanged per I/O is much smaller than in previous studies. Furthermore, we find an increase in the use of metadata in overall network communication, which can be taken advantage of through the use of smart storage devices.
