Effective grouping for energy and performance: construction of adaptive, sustainable, and maintainable data storage

The performance gap between processors and storage systems has grown increasingly critical over the years, and storage energy consumption is rapidly becoming a critical problem in its own right. Smarter caching and predictive techniques do much to narrow this gap, yet data storage remains a growing contributor to latency and energy consumption. Attempts have been made at data layout maintenance, the intelligent physical placement of data, but in practice basic heuristics remain predominant. The problems that early studies sought to solve via layout strategies were proven to be NP-hard, and data layout maintenance today remains more art than science. With unknown potential and a domain inherently full of uncertainty, layout maintenance persists as an area largely untapped by modern systems. But uncertainty in workloads does not imply randomness: access patterns have exhibited repeatable, stable behavior, and predictive information can be gathered, analyzed, and exploited to improve data layouts. Our goal is a dynamic, robust, sustainable predictive engine that improves existing layouts by replicating data at the storage device level. We present a comprehensive discussion of the design and construction of such an engine, including a workload evaluation covering both classical workloads and our own highly detailed traces collected over an extended period. We demonstrate significant gains from an initial static grouping mechanism, compare it against an optimal grouping method of our own construction, and show significant improvement over competing techniques. We also explore and illustrate the challenges of moving from static to dynamic (i.e., online) grouping, and provide motivation and solutions for addressing them. These challenges include metadata storage, appropriate predictive collocation, online performance, and physical placement.
We reduce the metadata needed by several orders of magnitude, cutting the required volume from more than 14% of total storage to less than h%. We also demonstrate that our collocation strategies outperform competing techniques. Finally, we present our complete model and evaluate a prototype implementation against real hardware. This model was shown to be capable of reducing device-level accesses by up to 65%.

Keywords: computer systems, collocation, data management, file systems, grouping, metadata, modeling and prediction, operating systems, performance, power, secondary storage.

[1]  Karthick Rajamani,et al.  Energy Management for Commercial , 2003 .

[2]  Peter Druschel,et al.  Anticipatory scheduling: a disk scheduling framework to overcome deceptive idleness in synchronous I/O , 2001, SOSP.

[3]  Kenneth Salem,et al.  Adaptive block rearrangement , 1993, TOCS.

[4]  Peter J. Denning,et al.  Effects of scheduling on file memory operations , 1899, AFIPS '67 (Spring).

[5]  Ricardo Bianchini,et al.  Limiting the power consumption of main memory , 2007, ISCA '07.

[6]  Ahmed Amer,et al.  File access prediction with adjustable accuracy , 2002, Conference Proceedings of the IEEE International Performance, Computing, and Communications Conference (Cat. No.02CH37326).

[7]  Carl Staelin,et al.  Smart Filesystems , 1991, USENIX Winter.

[8]  Ahmed Amer,et al.  Avoiding state-space explosion of predictive metadata with SESH , 2009, 2009 IEEE 28th International Performance Computing and Communications Conference.

[9]  J. Spencer Love,et al.  Caching strategies to improve disk system performance , 1994, Computer.

[10]  Anastasia Ailamaki,et al.  Atropos: A Disk Array Volume Manager for Orchestrated Use of Disks , 2004, FAST.

[11]  John Wilkes,et al.  UNIX Disk Access Patterns , 1993, USENIX Winter.

[12]  Nimrod Megiddo,et al.  ARC: A Self-Tuning, Low Overhead Replacement Cache , 2003, FAST.

[13]  Randal C. Burns,et al.  Using multiple predictors to improve the accuracy of file access predictions , 2003, 20th IEEE/11th NASA Goddard Conference on Mass Storage Systems and Technologies, 2003. (MSST 2003). Proceedings..

[14]  Dharmendra S. Modha,et al.  SARC: Sequential Prefetching in Adaptive Replacement Cache , 2005, USENIX Annual Technical Conference, General Track.

[15]  Jim Griffioen,et al.  Reducing File System Latency using a Predictive Approach , 1994, USENIX Summer.

[16]  Samuel J. Leffler,et al.  A Fast File System for UNIX (Revised July 27, 1983) , 1983 .

[17]  Xiaoning Ding,et al.  DULO: an effective buffer cache management scheme to exploit both temporal and spatial locality , 2005, FAST'05.

[18]  Kang G. Shin,et al.  FS2: dynamic data replication in free disk space for improving disk performance and energy consumption , 2005, SOSP '05.

[19]  Peter Norvig,et al.  Artificial Intelligence: A Modern Approach , 1995 .

[20]  S. Gurumurthi,et al.  Using STEAM for Thermal Simulation of Storage Systems , 2006, IEEE Micro.

[21]  Ahmed Amer,et al.  Predictive data grouping: Defining the bounds of energy and latency reduction through predictive data grouping and replication , 2008, TOS.

[22]  Darrell D. E. Long,et al.  Adaptive disk spin‐down for mobile computers , 2000, Mob. Networks Appl..

[23]  Mahadev Satyanarayanan,et al.  Using dynamic sets to reduce the aggregate latency of data access , 1997 .

[24]  Margo I. Seltzer,et al.  Disk Scheduling Revisited , 1990 .

[25]  Dan Duchamp,et al.  Prefetching Hyperlinks , 1999, USENIX Symposium on Internet Technologies and Systems.

[26]  John Wilkes Predictive power conservation , 2003 .

[27]  Christopher Chute,et al.  The Diverse and Exploding Digital Universe , 2011 .

[28]  Yuhui Deng,et al.  Exploiting the performance gains of modern disk drives by enhancing data locality , 2009, Inf. Sci..

[29]  Timothy J. Gibson,et al.  An Improved Long-Term File Usage Prediction Algorithm , 1999, Int. CMG Conference.

[30]  Darrell D. E. Long,et al.  A dynamic disk spin-down technique for mobile computing , 1996, MobiCom '96.

[31]  Philip M. Long,et al.  Learning to Make Rent-to-Buy Decisions with Systems Applications , 1995, ICML.

[32]  Carl Staelin,et al.  Idleness is Not Sloth , 1995, USENIX.

[33]  Thomas F. Wenisch,et al.  PowerNap: eliminating server idle power , 2009, ASPLOS.

[34]  Jun Wang,et al.  High performance energy efficient file storage system , 2006 .

[35]  Anand Sivasubramaniam,et al.  A Hybrid Cache and Prefetch Mechanism for Scientific Literature Search Engines , 2007, ICWE.

[36]  Hui Lei,et al.  An analytical approach to file prefetching , 1997 .

[37]  Randal C. Burns,et al.  Group-based management of distributed file caches , 2002, Proceedings 22nd International Conference on Distributed Computing Systems.

[38]  Christopher Small,et al.  Why does file system prefetching work? , 1999, USENIX Annual Technical Conference, General Track.

[39]  P. Krishnan,et al.  Thwarting the Power-Hungry Disk , 1994, USENIX Winter.

[40]  Stephen C. Tweedie,et al.  Journaling the Linux ext2fs Filesystem , 2008 .

[41]  Alexander P. Pons Web-application centric object prefetching , 2003, J. Syst. Softw..

[42]  Darrell D. E. Long,et al.  The case for efficient file access pattern modeling , 1999, Proceedings of the Seventh Workshop on Hot Topics in Operating Systems.

[43]  Shankar Pasupathy,et al.  Spyglass: Fast, Scalable Metadata Search for Large-Scale Storage Systems , 2009, FAST.

[44]  Chak-Kuen Wong,et al.  Algorithmic Studies in Mass Storage Systems , 1983, Springer Berlin Heidelberg.

[45]  Paul M. Greenawalt Modeling power management for hard disks , 1994, Proceedings of International Workshop on Modeling, Analysis and Simulation of Computer and Telecommunication Systems.

[46]  Frank Bellosa,et al.  Cooperative I / O-- A Novel I / O Semantics for Energy-Aware Applications , 2003 .

[47]  J. Flinn,et al.  Energy-aware adaptation for mobile applications , 1999, SOSP.

[48]  Alan Gilbert Merten,et al.  Some quantitative techniques for file organization , 1970 .

[49]  George Forman,et al.  Cool Job Allocation: Measuring the Power Savings of Placing Jobs at Cooling-Efficient Locations in the Data Center , 2007, USENIX Annual Technical Conference.

[50]  Scott A. Brandt,et al.  Performing file prediction with a program-based successor model , 2001, MASCOTS 2001, Proceedings Ninth International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems.

[51]  Mahadev Satyanarayanan,et al.  Disconnected Operation in the Coda File System , 1999, Mobidata.

[52]  Martin Pohlack,et al.  Rotational-position-aware real-time disk scheduling using a dynamic active subset (DAS) , 2003, RTSS 2003. 24th IEEE Real-Time Systems Symposium, 2003.

[53]  DiskPerformanceCarl StaelinHector Garcia-MolinaDepartment Clustering Active Disk Data to Improve , 1990 .

[54]  Ahmed Amer,et al.  Predictive reduction of power and latency (PuRPLe) , 2005, 22nd IEEE / 13th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST'05).

[55]  Xiaoning Ding,et al.  DiskSeen: Exploiting Disk Layout and Access History to Enhance I/O Prefetch , 2007, USENIX Annual Technical Conference.

[56]  Karthick Rajamani,et al.  Energy Management for Commercial Servers , 2003, Computer.

[57]  Anand Sivasubramaniam,et al.  Understanding the performance-temperature interactions in disk I/O of server workloads , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..

[58]  Shankar Pasupathy,et al.  High-performance metadata indexing and search in petascale data storage systems , 2008 .

[59]  Vagelis Hristidis,et al.  BORG: Block-reORGanization for Self-optimizing Storage Systems , 2009, FAST.

[60]  John Wilkes,et al.  Disk scheduling algorithms based on rotational position , 1991 .

[61]  Thomas E. Anderson,et al.  A Comparison of File System Workloads , 2000, USENIX Annual Technical Conference, General Track.

[62]  Philip M. Long,et al.  Adaptive Disk Spindown via Optimal Rent-to-Buy in Probabilistic Environments , 1999, Algorithmica.

[63]  P. Krishnan,et al.  Optimal prefetching via data compression , 1991, [1991] Proceedings 32nd Annual Symposium of Foundations of Computer Science.

[64]  Erez Zadok,et al.  Energy and performance evaluation of lossless file data compression on server systems , 2009, SYSTOR '09.

[65]  Anand Sivasubramaniam,et al.  Disk drive roadmap from the thermal perspective: a case for dynamic thermal management , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[66]  Fred Douglis,et al.  Adaptive Disk Spin-Down Policies for Mobile Computers , 1995, Comput. Syst..

[67]  Mark Herbster,et al.  Tracking the Best Expert , 1995, Machine-mediated learning.

[68]  Darrell D. E. Long,et al.  Predictive data grouping using successor prediction , 2002 .

[69]  Paul S. Fisher,et al.  FI-based file access predictor , 2009, ACM-SE 47.

[70]  P. Krishnan Online prediction algorithms for databases and operating systems , 1996 .

[71]  Darrell D. E. Long,et al.  Exploring the Bounds of Web Latency Reduction from Caching and Prefetching , 1997, USENIX Symposium on Internet Technologies and Systems.

[72]  P. Krishnan,et al.  Practical prefetching via data compression , 1993 .

[73]  Ahmed Amer,et al.  STEP: Self-Tuning Energy-safe Predictors , 2005, MDM '05.

[74]  Gregory R. Ganger,et al.  Track-Aligned Extents: Matching Access Patterns to Disk Drive Characteristics , 2002, FAST.

[75]  Nikolai Joukov,et al.  A nine year study of file system and storage benchmarking , 2008, TOS.

[76]  Suresh Singh,et al.  Applying models of user activity for dynamic power management in wireless devices , 2008, Mobile HCI.

[77]  Dong Li,et al.  A performance-oriented energy efficient file system , 2004, SNAPI '04.

[78]  Gregory R. Ganger,et al.  The DiskSim Simulation Environment Version 4.0 Reference Manual (CMU-PDL-08-101) , 1998 .

[79]  Arif Merchant,et al.  TaP: Table-based Prefetching for Storage Caches , 2008, FAST.

[80]  Mahmut T. Kandemir,et al.  Improving disk reuse for reducing power consumption , 2007, Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07).

[81]  Nimrod Megiddo,et al.  Outperforming LRU with an adaptive replacement cache algorithm , 2004, Computer.

[82]  J. C. Browne,et al.  Trace driven modeling: Review and overview , 1973, ANSS '73.

[83]  Darrell D. E. Long,et al.  Noah: low-cost file access prediction through pairs , 2001, Conference Proceedings of the 2001 IEEE International Performance, Computing, and Communications Conference (Cat. No.01CH37210).

[84]  Eran Gabber,et al.  Storage Management for Web Proxies , 2001, USENIX Annual Technical Conference, General Track.

[85]  M. Frans Kaashoek,et al.  Embedded Inodes and Explicit Grouping: Exploiting Disk Bandwidth for Small Files , 1997, USENIX Annual Technical Conference.

[86]  Sang Lyul Min,et al.  LRFU (Least Recently/Frequently Used) Replacement Policy: A Spectrum of Block Replacement Policies , 1996 .

[87]  Jeffrey C. Mogul,et al.  Using predictive prefetching to improve World Wide Web latency , 1996, CCRV.

[88]  Antony I. T. Rowstron,et al.  Write off-loading: Practical power management for enterprise storage , 2008, TOS.

[89]  Luiz André Barroso,et al.  The Case for Energy-Proportional Computing , 2007, Computer.

[90]  Michael L. Scott,et al.  Energy efficient prefetching and caching , 2004 .

[91]  Darrell D. E. Long,et al.  Swift: Using Distributed Disk Striping to Provide High I/O Data Rates , 1991, Comput. Syst..

[92]  Chris Ruemmler,et al.  Disk Shuffling , 1991 .

[93]  Carl Staelin,et al.  File system design using large memories , 1990, Proceedings of the 5th Jerusalem Conference on Information Technology, 1990. 'Next Decade in Information Technology'.

[94]  Arvind Krishnamurthy,et al.  Modeling Hard-Disk Power Consumption , 2003, FAST.

[95]  Margo I. Seltzer,et al.  A Comparison of FFS Disk Allocation Policies , 1996, USENIX Annual Technical Conference.

[96]  Abraham Lempel,et al.  Compression of individual sequences via variable-rate coding , 1978, IEEE Trans. Inf. Theory.

[97]  Yuanyuan Zhou,et al.  Power-aware storage cache management , 2005, IEEE Transactions on Computers.

[98]  Alan Jay Smith,et al.  Software strategies for portable computer energy management , 1998, IEEE Wirel. Commun..

[99]  Helen Custer,et al.  Inside the Windows NT File System , 1994 .

[100]  Mahadev Satyanarayanan,et al.  Long Term Distributed File Reference Tracing: Implementation and Experience , 1996, Softw. Pract. Exp..

[101]  Nikolai Joukov,et al.  GreenFS: making enterprise computers greener by protecting them better , 2008, Eurosys '08.

[102]  Chris Gniady,et al.  Context-Aware Mechanisms for Reducing Interactive Delays of Energy Management in Disks , 2008, USENIX Annual Technical Conference.

[103]  Scott A. Brandt,et al.  ACME: Adaptive Caching Using Multiple Experts , 2002, WDAS.

[104]  Darrell D. E. Long,et al.  Predicting Future File-System Actions From Prior Events , 1996, USENIX Annual Technical Conference.

[105]  Manfred K. Warmuth,et al.  The weighted majority algorithm , 1989, 30th Annual Symposium on Foundations of Computer Science.

[106]  Sung Hoon Baek,et al.  Prefetching with Adaptive Cache Culling for Striped Disk Arrays , 2008, USENIX Annual Technical Conference.

[107]  Robert Sedgewick,et al.  Algorithms in C , 1990 .

[108]  Prashant J. Shenoy,et al.  Rules of thumb in data engineering , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[109]  Michael L. Scott,et al.  Energy efficiency through burstiness , 2003, 2003 Proceedings Fifth IEEE Workshop on Mobile Computing Systems and Applications.

[110]  Ahmed Amer,et al.  A stochastic approach to file access prediction , 2003, SNAPI '03.

[111]  Philip H. Seaman,et al.  On Teleprocessing System Design Part IV: An Analysis of Auxiliary Storage Activity , 1966, IBM Syst. J..

[112]  Ahmed Amer,et al.  Visualizing cache effects on I/O workload predictability , 2003, Conference Proceedings of the 2003 IEEE International Performance, Computing, and Communications Conference, 2003..

[113]  David Essary,et al.  Space-Efficient Predictive Block Management , 2009 .

[114]  Michael Kistler,et al.  The case for power management in web servers , 2002 .

[115]  Darrell D. E. Long,et al.  Design and Implementation of a Predictive File Prefetching Algorithm , 2001, USENIX Annual Technical Conference, General Track.