As cloud storage platform software, Ceph can be used to build petabyte-scale storage systems from commodity hardware [1] [2] [3]. In previous work, we optimized the performance of reads from and writes to the Ceph storage cluster using multi-threaded algorithms [4]. The experimental results indicated that the small-file read/write and large-file read algorithms improved noticeably, but that large-file writes to Ceph using the single-pipeline algorithm showed no evident improvement. To address this problem, we use a multiple-pipelines algorithm to optimize the performance of large-file writes to the Ceph storage cluster. The experimental results show that, when the data block size is set to 10 MB, the maximum performance improvement of the multiple-pipelines algorithm on a machine with two logical CPUs is 100.70%. At the same time, because of limitations of Python's multi-threading mechanism, such as the Python GIL and thread thrashing, the performance of the multiple-pipelines algorithm on multi-core machines does not increase linearly. In future work, we intend to optimize the large-file write algorithm using the C++ application programming interface of Ceph.
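The general idea of writing a large file through several pipelines can be illustrated with a minimal Python sketch. This is not the algorithm evaluated in this work, only one plausible arrangement under stated assumptions: the file is cut into fixed-size blocks (10 MB by default, the block size mentioned above), and the blocks are handed round-robin to a small number of worker threads, each fed by its own queue. The names `write_large_file`, `pipeline_worker`, and the `write_block(offset, data)` callback are hypothetical; in a real deployment the callback might wrap a write call of Ceph's Python binding, which is an assumption here rather than something the text specifies.

```python
import queue
import threading

BLOCK_SIZE = 10 * 1024 * 1024  # 10 MB, the block size mentioned in the abstract


def pipeline_worker(tasks, write_block, errors):
    """Consume (offset, data) tasks from one pipeline until a None sentinel."""
    while True:
        task = tasks.get()
        if task is None:
            break
        offset, data = task
        try:
            # write_block is a hypothetical callback supplied by the caller.
            write_block(offset, data)
        except Exception as exc:
            # Record the failure but keep draining so the producer never blocks.
            errors.append(exc)


def write_large_file(stream, write_block, num_pipelines=2, block_size=BLOCK_SIZE):
    """Split `stream` into blocks and dispatch them round-robin to pipelines.

    Returns the total number of bytes dispatched. Raises the first error
    reported by any pipeline after all workers have finished.
    """
    errors = []
    # Bounded queues keep at most a few blocks per pipeline in memory.
    queues = [queue.Queue(maxsize=4) for _ in range(num_pipelines)]
    workers = [
        threading.Thread(target=pipeline_worker, args=(q, write_block, errors))
        for q in queues
    ]
    for worker in workers:
        worker.start()

    offset = 0
    index = 0
    try:
        while True:
            data = stream.read(block_size)
            if not data:
                break
            queues[index % num_pipelines].put((offset, data))
            offset += len(data)
            index += 1
    finally:
        # Always send one sentinel per pipeline and wait for the workers.
        for q in queues:
            q.put(None)
        for worker in workers:
            worker.join()

    if errors:
        raise errors[0]
    return offset
```

Because of the GIL, threads in such a sketch overlap only while they wait on I/O, which is consistent with the observation above that throughput does not scale linearly with the number of cores.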
[1] Ke Zhan, et al. Optimization of Ceph Reads/Writes Based on Multi-threaded Algorithms. 2016 IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2016.
[2] S.A. Brandt, et al. CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data. ACM/IEEE SC 2006 Conference (SC'06), 2006.
[3] Jason J Hill, et al. Ceph Parallel File System Evaluation Report. 2013.
[4] Scott A. Brandt, et al. Dynamic Metadata Management for Petabyte-Scale File Systems. Proceedings of the ACM/IEEE SC2004 Conference, 2004.
[5] Carlos Maltzahn, et al. Ceph: a scalable, high-performance distributed file system. OSDI '06, 2006.
[6] Heon Young Yeom, et al. Performance Optimization for All Flash Scale-Out Storage. 2016 IEEE International Conference on Cluster Computing (CLUSTER), 2016.