Understanding I/O Performance of IPFS Storage: A Client's Perspective

IPFS has surged in popularity in recent years. It organizes user data as multiple objects, which users retrieve by their Content IDentifiers (CIDs). Because IPFS is a storage system, understanding its data I/O performance is of great importance, yet existing work lacks a comprehensive study of it. In this work, we deploy an IPFS storage system with geographically distributed storage nodes on Amazon EC2. We then conduct extensive experiments to evaluate the performance of data I/O operations from a client's perspective. We find that the access patterns of I/O operations (e.g., request size) severely affect I/O performance, since IPFS typically uses different I/O strategies to serve different I/O requests. Moreover, for read operations, IPFS must resolve remote nodes and download objects over the Internet. Our experimental study reveals that both the resolving and the downloading operations can become bottlenecks. Our results can shed light on optimizing IPFS to avoid high-latency I/O operations.
