Evaluating the performance and scalability of the Ceph distributed storage system

As data volumes in every field continue to grow, storage systems must scale to meet increasing demands for performance, reliability, and fault tolerance, which in turn raises their complexity and cost. Improving the performance and scalability of storage systems while keeping costs low is therefore crucial. Ceph, the open-source storage system evaluated in this study, promises to store data reliably across many distributed nodes and targets commodity hardware. We investigate how Ceph performs in different setups and compare the results with the theoretical maximum performance of the hardware. Using a bottom-up approach, we benchmarked Ceph at different architectural levels and varied the number of storage nodes and clients to test the system's scalability. Our experiments show that Ceph delivers the promised scalability, and they uncovered several points with potential for improvement. We observed a significant increase in write throughput when moving the Ceph journal to a faster location (in memory). Moreover, while the system scaled with an increasing number of clients operating on the cluster, we noticed a slight performance degradation beyond the saturation point. We tested two optimisation strategies, increasing the available RAM and increasing the object size, and measured write-throughput gains of up to 9% and 27%, respectively. Our findings improve the understanding of Ceph and should benefit future users through the presented strategies for tackling various performance limitations.
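To make the journal-relocation strategy mentioned in the abstract concrete, the sketch below shows how the OSD journal location could be configured in a FileStore-era Ceph deployment. This is an illustrative configuration fragment, not the exact setup used in the study; the tmpfs mount point and journal size are assumptions.

```
[osd]
# Illustrative only: place the journal on a tmpfs (in-memory) mount
# instead of the default on-disk location. The path is an example.
osd journal = /mnt/tmpfs/osd.$id.journal
# Journal size in MB; the value shown is an assumption.
osd journal size = 1024
```

Note that a journal held in memory is volatile, so a configuration like this trades durability for write throughput and is suitable mainly for benchmarking, as in the experiments described here.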