A scalable data distributing method supporting replication and weight

Designing a large-scale distributed storage system requires balancing availability, scalability, user experience, and resource utilization. Known approaches for distributing and locating huge numbers of objects in such systems cannot meet all of these requirements. In this paper, we propose a new method that uses a three-level mapping to locate objects. With this method, objects can be distributed across a large number of storage devices with high scalability. Clients can access objects in parallel without consulting a central server on each access, and locating an object generally takes a single step. The method also supports replication and weighted allocation, both of which are needed for a system to grow efficiently while accommodating new technology. Theoretical proofs and experiments show that our method is comparable to other methods and can meet all the requirements of an advanced large-scale distributed storage system.
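The abstract does not spell out the three-level mapping, so the following is not the paper's algorithm. It is a minimal sketch of the properties claimed above (client-side lookup with no central server, replication, and weighted allocation), using weighted rendezvous hashing as a stand-in technique. The names `place` and `_unit_hash` and the device labels are hypothetical, chosen only for illustration.

```python
import hashlib
import math


def _unit_hash(key: str, device: str) -> float:
    """Map a (key, device) pair to a float strictly between 0 and 1."""
    digest = hashlib.sha256(f"{key}:{device}".encode()).digest()
    # Keep 52 bits so that value + 0.5 is exact and the result never hits 0 or 1.
    value = int.from_bytes(digest[:8], "big") >> 12
    return (value + 0.5) / 2**52


def place(key: str, weights: dict[str, float], replicas: int = 3) -> list[str]:
    """Return `replicas` distinct devices for `key`, favoring heavier devices.

    Stand-in technique (weighted rendezvous hashing), NOT the paper's
    three-level mapping. Any client holding the same `weights` table
    computes the same answer locally.
    """
    if replicas > len(weights):
        raise ValueError("not enough devices for the requested replica count")
    # Score each device as -w / ln(h); the highest scores win.
    ranked = sorted(
        weights,
        key=lambda dev: -weights[dev] / math.log(_unit_hash(key, dev)),
        reverse=True,
    )
    return ranked[:replicas]


if __name__ == "__main__":
    # Hypothetical cluster: osd2 has three times the capacity of the others.
    cluster = {"osd0": 1.0, "osd1": 1.0, "osd2": 3.0, "osd3": 1.0}
    print(place("object-42", cluster, replicas=2))
```

In this sketch a lookup is a single local computation over the device table, replicas are the top-ranked devices, and a device's weight controls its expected share of objects. Removing a device only relocates the objects that were placed on it, which is one reason this family of schemes is used when a system must grow or shrink.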
