Scalable randomization for dynamic data storage
A scalable storage architecture is important in systems that store and access growing data sets. Scalable storage provides the ability to increase or decrease the overall storage size as needed. A truly scalable system should accept any amount of growth in data size without any hindrance to system performance.
This dissertation describes a scalable randomization algorithm that enables an efficient, dynamically expandable heterogeneous storage system. While previous techniques can support a balanced load and quick data retrieval, they move data inefficiently during scaling. Other techniques minimize data movement but do not offer incremental scaling. The challenge is to find a scaling solution that minimizes data movement while maintaining a balanced load and providing fast data access.
The proposed two-stage scaling solution delivers these qualities by combining the RDL and SCADDAR algorithms. RDL allows repeated scaling to increase or decrease the storage, and scaling operations with RDL can be repeated indefinitely as long as the amount of storage remains below an upper limit. SCADDAR is activated once this limit is exceeded. Each SCADDAR scaling operation incurs a slight increase in access time. Eventually, a complete data reorganization restores this access time by evolving the system to a new RDL state with a higher upper limit, allowing RDL scaling to resume.
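To make the two-stage policy concrete, the sketch below models only the decision flow described above: RDL-style scaling while the disk count stays under the current upper limit, SCADDAR-style remapping beyond it, and a full reorganization once too many SCADDAR operations have accumulated. The class, its parameter names, and the doubling of the limit after a reorganization are illustrative assumptions, not the dissertation's implementation of RDL or SCADDAR.

```python
# Hypothetical sketch of the two-stage scaling policy; the thresholds and
# the limit-growth rule are assumptions for illustration only.

class TwoStageScaler:
    """Toy model: RDL placement under the upper limit, SCADDAR remapping
    beyond it, and a periodic reorganization to a new RDL state."""

    def __init__(self, num_disks: int, rdl_limit: int, scaddar_budget: int):
        self.num_disks = num_disks
        self.rdl_limit = rdl_limit        # upper limit of the current RDL state
        self.scaddar_ops = 0              # SCADDAR operations since last reorg
        self.scaddar_budget = scaddar_budget

    def scale(self, added_disks: int) -> str:
        """Return which stage handles this scaling operation."""
        self.num_disks += added_disks
        if self.num_disks <= self.rdl_limit:
            return "rdl"                  # stage 1: repeatable RDL scaling
        self.scaddar_ops += 1
        if self.scaddar_ops > self.scaddar_budget:
            # Access time has degraded enough: reorganize into a new RDL
            # state with a higher upper limit, then RDL scaling resumes.
            self.rdl_limit *= 2           # assumed growth policy
            self.scaddar_ops = 0
            return "reorganize"
        return "scaddar"                  # stage 2: slight access-time penalty
```

For example, a system initialized with `TwoStageScaler(num_disks=8, rdl_limit=16, scaddar_budget=4)` stays in the RDL stage until additions push it past 16 disks, then tolerates four SCADDAR operations before a reorganization.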
Advances in technology inevitably lead to heterogeneous storage, as newer disks replace older homogeneous ones. Support for heterogeneous disk scaling is accomplished through the BroadScale algorithm. Although maximizing the resources of heterogeneous disks has been studied previously, that work did not consider dynamic scaling. When scaling heterogeneous disks, BroadScale takes into account both disk bandwidth and capacity.
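The abstract states only that both bandwidth and capacity are considered; one plausible way to reflect this, sketched below, is to cap each disk's share of blocks by whichever of the two resources is relatively scarcer for that disk. The weighting scheme, the `Disk` type, and the function name are assumptions for illustration and are not the BroadScale algorithm itself.

```python
# Hypothetical weighting sketch: each disk's load fraction is bounded by its
# scarcer resource (bandwidth or capacity). Not BroadScale's actual scheme.

from dataclasses import dataclass

@dataclass
class Disk:
    name: str
    bandwidth_mbps: float
    capacity_gb: float

def target_fractions(disks: list[Disk]) -> dict[str, float]:
    """Assign each disk a normalized load fraction limited by whichever
    resource share (bandwidth or capacity) is smaller for that disk."""
    total_bw = sum(d.bandwidth_mbps for d in disks)
    total_cap = sum(d.capacity_gb for d in disks)
    raw = {d.name: min(d.bandwidth_mbps / total_bw, d.capacity_gb / total_cap)
           for d in disks}
    norm = sum(raw.values())
    return {name: share / norm for name, share in raw.items()}

# Example: a newer disk with twice the bandwidth but the same capacity is
# capped by its capacity share rather than its bandwidth share.
print(target_fractions([Disk("old", 40.0, 100.0), Disk("new", 80.0, 100.0)]))
```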
The two-stage scaling solution has been implemented with the Yima real-time continuous media server for evaluation and analysis. It can be viewed as a new type of randomized file organization that is extendible with minimal data migration while maintaining load balancing and fast data access.