BORG: Block-reORGanization and Self-optimization in Storage Systems

This paper presents the design, implementation, and evaluation of BORG, a self-optimizing storage system that automatically reorganizes blocks based on the observed I/O workload. BORG is motivated by three characteristics of I/O workloads: non-uniform access frequency distribution, temporal locality, and partial determinism in non-sequential accesses. BORG manages a small, dedicated partition on the disk drive, with the goal of servicing a majority of I/O requests from within this partition at significantly reduced seek and rotational delays. BORG is transparent to the rest of the storage stack, including applications, file systems, and I/O schedulers, and therefore requires little or no modification to storage stack implementations. We evaluated a Linux implementation of BORG using several real-world workloads, including individual user desktop environments, a web server, a virtual machine monitor, and an SVN server. These experiments demonstrate BORG’s effectiveness in improving I/O performance and quantify the resource overhead it incurs.
