GreenDM: A Versatile Tiering Hybrid Drive for the Trade-Off Evaluation of Performance, Energy, and Endurance

Abstract of the Dissertation "GreenDM: A Versatile Tiering Hybrid Drive for the Trade-Off Evaluation of Performance, Energy, and Endurance" by Zhichao Li, for the Degree of Doctor of Philosophy in Computer Science, Stony Brook University, 2014.

There are trade-offs among performance, energy, and device endurance for storage systems. These trade-offs become more complex in storage systems comprising different storage technologies. Designs optimized for one dimension or workload often suffer in another. Therefore, it is important to study the trade-offs so that a system can be adapted to its workloads. Because different types of drives have different traits, tiering hybrid drives have been studied more closely. However, previous tiering hybrids are often designed for high throughput, efficient energy consumption, or improved endurance, leaving the trade-offs among these goals empirically unexplored. Past endurance studies also lack a concrete model and metric to help study the trade-offs. Lastly, previous designs are often based on inflexible policies that cannot adapt easily to changing conditions. We designed and developed GreenDM, a versatile tiering hybrid drive that combines Flash-based SSDs with traditional HDDs; we present our endurance model to study the aforementioned trade-offs. GreenDM presents a block interface and requires no modifications to existing application software. GreenDM migrates hot data to the faster SSD and cold data to the slower HDD. GreenDM offers tunable parameters useful in adapting the system to many workloads. We have designed, developed, and carefully evaluated GreenDM with a variety of workloads using commodity SSD and HDD drives. We demonstrated that versatility is important for adapting to various workloads. Our thesis is that one must study the trade-offs among performance, energy, and endurance, especially in the ever more popular tiered storage systems, to enable adaptation to workloads.
Our system is versatile: by adjusting its important parameters, it can adapt to different workloads and achieve particular trade-offs. We also provide several interesting observations along the cost dimension. We developed a cost model for GreenDM and evaluated it under realistic cost metrics. Future storage system designs have to consider multiple optimization dimensions: performance, energy, endurance, and dollar cost. We close with several interesting directions for long-term future research. First, it will be interesting to provide automated control knobs that let users trade off performance, energy efficiency, and endurance. Second, one could extend the two-tier system to three tiers and explore more tiering policies. Third, it would be useful to add security as an additional dimension to further explore these trade-offs. Fourth, one could experiment with different storage devices and policies in the future, and help build more efficient storage systems that achieve high performance at minimum cost. Fifth and last, it would be interesting to provide control support at the CPU level as well, to further justify the trade-offs among performance, energy, and endurance.
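
The abstract states that GreenDM migrates hot data to the SSD and cold data to the HDD under tunable parameters, but it does not spell out the policy. The following is only a toy illustration of that general idea, not GreenDM's implementation: the class name `TieringSketch`, the access-count heuristic, and the parameters `promote_threshold` and `ssd_capacity` are all hypothetical.

```python
from collections import Counter


class TieringSketch:
    """Toy two-tier extent map: hot extents live on 'ssd', the rest on 'hdd'.

    Everything here is an assumption made for illustration; the abstract
    does not describe GreenDM's actual policy or parameter names.
    """

    def __init__(self, promote_threshold=3, ssd_capacity=2):
        self.promote_threshold = promote_threshold  # accesses before an extent counts as hot
        self.ssd_capacity = ssd_capacity            # number of extents the SSD tier can hold
        self.hits = Counter()                       # per-extent access counts
        self.ssd = set()                            # extents currently on the SSD tier

    def access(self, extent):
        """Record one access and return the tier that served it."""
        served_from = "ssd" if extent in self.ssd else "hdd"
        self.hits[extent] += 1
        if served_from == "hdd" and self.hits[extent] >= self.promote_threshold:
            self._promote(extent)
        return served_from

    def _promote(self, extent):
        """Move a hot extent to the SSD, demoting the coldest resident if full."""
        if len(self.ssd) >= self.ssd_capacity:
            coldest = min(self.ssd, key=lambda e: self.hits[e])
            if self.hits[coldest] >= self.hits[extent]:
                return  # the resident extent is at least as hot; skip the migration
            self.ssd.remove(coldest)  # demote: the extent is served from HDD again
        self.ssd.add(extent)
```

In a sketch like this, raising `promote_threshold` reduces migrations (and therefore SSD writes, which matter for endurance) at the price of serving more requests from the slower tier, which is one simple way to picture the kind of trade-off the dissertation evaluates.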
