Effects on performance and energy reduction by file relocation based on file-access correlations

Many recent advanced applications of information systems require large amounts of data to be stored across multiple hard disk drives (HDDs). How data is placed on these drives strongly affects how much energy can be saved while retaining system performance. Several approaches have been proposed that concentrate frequently accessed files on a limited number of drives so that the other drives can spin down. However, the placement of infrequently accessed data is also significant, because the energy consumed by spinning up drives to access these data cannot be ignored, particularly when files that tend to be used together are scattered across many different drives. In this paper, we propose a novel method named PLECO (Placement of files for Latency and Energy Consumption Optimization), which places correlated files on the same drive to reduce power consumption and improve system performance. We evaluated the proposed method by simulation, and the results indicate that it can reduce energy consumption by up to 32% and access latency by up to 92% compared with a baseline system.
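The idea of co-locating correlated files can be illustrated with a minimal sketch. This is not the PLECO algorithm itself, which the abstract does not specify; it is a simple greedy heuristic written under several assumptions: correlation is approximated by how often two files appear within a short window of consecutive accesses, all files have equal size, and drive capacity is measured in number of files. The names `co_access_counts`, `place_files`, `window`, and `capacity` are hypothetical.

```python
from collections import Counter


def co_access_counts(trace, window=3):
    """Count how often two distinct files occur within `window` consecutive accesses.

    Assumption: temporal proximity in the trace is a proxy for correlation.
    """
    counts = Counter()
    for i, cur in enumerate(trace):
        # Distinct files seen in the (window - 1) accesses just before position i.
        for prev in set(trace[max(0, i - window + 1):i]):
            if prev != cur:
                counts[tuple(sorted((prev, cur)))] += 1
    return counts


def place_files(trace, num_drives, capacity, window=3):
    """Greedy sketch: merge the most strongly correlated files into groups,
    then assign each group to the first drive with enough free space.

    Assumption: equal-size files, so `capacity` is a file count per drive.
    """
    group_of = {f: {f} for f in trace}  # every file starts in its own group
    for (a, b), _ in co_access_counts(trace, window).most_common():
        ga, gb = group_of[a], group_of[b]
        if ga is not gb and len(ga) + len(gb) <= capacity:
            ga |= gb
            for f in gb:
                group_of[f] = ga

    # Collect the distinct groups, largest first (ties broken by file name).
    groups, seen = [], set()
    for f in sorted(group_of):
        g = group_of[f]
        if id(g) not in seen:
            seen.add(id(g))
            groups.append(g)
    groups.sort(key=lambda g: (-len(g), min(g)))

    # First-fit assignment of whole groups to drives.
    drives = [set() for _ in range(num_drives)]
    for g in groups:
        for d in drives:
            if len(d) + len(g) <= capacity:
                d |= g
                break
        else:
            raise ValueError("not enough drive capacity for all files")
    return drives
```

Under these assumptions, files that are repeatedly accessed together end up on one drive, so serving such a burst of accesses needs at most one spin-up rather than one per file. A real placement policy would also have to account for file sizes, access frequency, and migration cost.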
