Towards higher disk head utilization: extracting free bandwidth from busy disk drives

Freeblock scheduling is a new approach to utilizing more of a disk's potential media bandwidth. By filling rotational latency periods with useful media transfers, 20-50% of a never-idle disk's bandwidth can often be provided to background applications with no effect on foreground response times. This paper describes freeblock scheduling and demonstrates its value with simulation studies of two concrete applications: segment cleaning and data mining. Free segment cleaning often allows a log-structured file system (LFS) to maintain its ideal write performance when cleaning overheads would otherwise reduce performance by up to a factor of three. Free data mining can achieve over 47 full disk scans per day on an active transaction processing system, with no effect on its disk performance.
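
The core idea, filling a foreground request's rotational latency with background transfers, can be illustrated with a minimal sketch. This is not the scheduler evaluated in the paper: it assumes a single track with uniformly sized sectors, considers only the window between the end of the seek and the arrival of the foreground target sector, and ignores the possibility of detouring to other tracks. The function name `free_sectors` and all parameters are hypothetical.

```python
def free_sectors(head_pos, target, wanted, sectors_per_track):
    """Return the background sectors that pass under the head before `target`.

    head_pos: sector under the head when the foreground seek completes
    target:   first sector of the foreground request on this track
    wanted:   set of sectors on this track that a background task still needs
    (Simplified, hypothetical model: one track, no head switches or extra seeks.)
    """
    # Rotational latency, measured in sectors the disk must spin past.
    latency = (target - head_pos) % sectors_per_track
    # Sectors that rotate under the head during that otherwise wasted window.
    passing = [(head_pos + i) % sectors_per_track for i in range(latency)]
    # Any wanted sector in the window can be read without delaying the foreground.
    return [s for s in passing if s in wanted]


picked = free_sectors(head_pos=90, target=10, wanted={5, 50, 95, 99},
                      sectors_per_track=100)
print(picked)  # [95, 99, 5]
```

In this example the head waits 20 sector times for sector 10 to arrive; three of the four wanted background sectors fall inside that window and are picked up, while sector 50 must wait for a later opportunity.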
