Dynamic file allocation in disk arrays

Large arrays of small disks are being considered as a promising approach to high-performance I/O architectures. In this paper we deal with the problem of data placement in such a disk array. The prevalent approach is to decluster large files across a number of disks so as to minimize the access time to a file and balance the I/O load across the disks. The data placement problem entails determining both the number of disks and the specific set of disks across which a file is declustered. Unlike previous work, this paper does not assume that all files are allocated at the same time but rather considers dynamic file creations. This makes the placement problem considerably harder because each placement decision has to take into account the current allocation state and the access frequencies of the disks and the existing files. As a result, file creation may involve partial reorganization on one or more disks. The paper proposes heuristic algorithms for the placement of dynamically created files. The algorithms provide a good compromise between maximizing the I/O performance of the disk array and minimizing the work invested in partial reorganizations. The paper presents preliminary performance results for several alternative algorithms under a synthetic workload.
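To make the placement decision concrete, the following is a minimal sketch of one possible greedy heuristic. It is not the algorithm proposed in the paper. The names `Disk`, `heat`, `place_file`, and `max_degree` are hypothetical, and the sketch assumes that a file is split evenly across the chosen disks and that access frequency ("heat") is spread evenly over them. It tries the widest declustering first, prefers the least-loaded disks that have enough free space, and returns `None` when no placement fits, which is the case where a partial reorganization would be needed.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    free_blocks: int   # unallocated space on this disk (hypothetical unit: blocks)
    heat: float = 0.0  # summed access frequency of the file parts stored here


def place_file(disks, size_blocks, file_heat, max_degree):
    """Pick disks for a newly created file; return their indices or None.

    Illustrative only: tries the largest degree of declustering first and
    narrows it until enough disks with sufficient free space are found.
    """
    for degree in range(min(max_degree, len(disks)), 0, -1):
        share = -(-size_blocks // degree)  # ceiling of size_blocks / degree
        candidates = [i for i, d in enumerate(disks) if d.free_blocks >= share]
        if len(candidates) >= degree:
            # Prefer the coolest disks so the I/O load stays balanced.
            chosen = sorted(candidates, key=lambda i: disks[i].heat)[:degree]
            for i in chosen:
                disks[i].free_blocks -= share
                disks[i].heat += file_heat / degree
            return chosen
    return None  # no fit without reorganizing existing files
```

A caller would keep one `Disk` record per physical disk and invoke `place_file` on every file creation. A `None` result marks the point at which a real system would have to weigh the cost of moving existing data against the expected performance gain.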
