Metadata Partitioning for Large-Scale Distributed Storage Systems

With the emergence of large-scale storage systems that separate metadata management from file read/write operations, and with requests targeting metadata accounting for over 80\% of all I/O requests, metadata management has become an interesting research problem in its own right. When designing a metadata server cluster, the partitioning of metadata among the servers is critical for maintaining efficient metadata operations and a balanced load distribution across the cluster. We propose a dynamic programming method combined with binary search to solve the partitioning problem. Through theoretical analysis and extensive experiments, we show that our algorithm finds the partitioning that minimizes load imbalance among servers and maximizes the efficiency of metadata operations.
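
To give a feel for how binary search and dynamic programming can work together on a load-balancing partition, the following is a minimal sketch on a deliberately simplified problem. It is not the algorithm proposed here. It assumes that metadata items form a flat ordered sequence with non-negative integer loads, that each server receives one contiguous run of items, and that the only goal is to minimize the maximum per-server load. The names `min_groups` and `min_max_load` are hypothetical.

```python
from typing import List


def min_groups(loads: List[int], cap: int) -> float:
    """DP feasibility check (illustrative assumption, not the paper's method).

    Returns the fewest contiguous groups that cover `loads` such that no
    group's total load exceeds `cap`, or infinity if that is impossible.
    """
    n = len(loads)
    # best[i] = fewest groups needed to cover the first i items
    best = [0.0] + [float("inf")] * n
    for i in range(1, n + 1):
        total = 0
        j = i
        # Grow the last group backwards while it still fits under the cap.
        while j > 0 and total + loads[j - 1] <= cap:
            total += loads[j - 1]
            j -= 1
            best[i] = min(best[i], best[j] + 1)
    return best[n]


def min_max_load(loads: List[int], k: int) -> int:
    """Binary search on the load cap, using the DP above as the oracle.

    Assumes `loads` is non-empty and k >= 1.
    """
    lo, hi = max(loads), sum(loads)
    while lo < hi:
        mid = (lo + hi) // 2
        if min_groups(loads, mid) <= k:
            hi = mid        # a cap of `mid` is achievable with k servers
        else:
            lo = mid + 1    # cap too tight, need a larger bound
    return lo


if __name__ == "__main__":
    # Six items with hypothetical loads, split across three servers.
    print(min_max_load([4, 2, 7, 1, 3, 5], 3))
```

The binary search works because feasibility is monotone in the cap: if some cap admits a partition into at most k groups, every larger cap does too. A real metadata namespace is hierarchical rather than flat, so this sketch only conveys the general shape of the approach.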