Crawling the IPFS Network

IPFS is a distributed data storage service frequently used by blockchain applications and for sharing content in a censorship-resistant manner. Data is hosted by an open set of peers, and pointers to both the data and the peers are distributed using a Kademlia-based distributed hash table (DHT). In this demo, we present a crawler for the IPFS overlay network (ipfs_crawler) that can be used to study and monitor the network’s structure. Specifically, ipfs_crawler systematically traverses the Kademlia DHT of IPFS to enumerate the peers in the network and a subset of the connections between them. Because the overlay network significantly influences the robustness and performance of IPFS, ipfs_crawler is an important building block for assessing the state and health of the network. Since network communication in IPFS is carried out through the libp2p networking library, ipfs_crawler can easily be adapted to crawl other libp2p-based networks.
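The systematic traversal described above can be pictured as a breadth-first search over the overlay. The following is a minimal sketch under stated assumptions, not the ipfs_crawler implementation: the function `crawl`, the callback `query_neighbors`, and the toy network are hypothetical names introduced for illustration. In a real Kademlia network, `query_neighbors` would stand in for asking a peer over the network for entries of its routing table; here it is simulated with a dictionary.

```python
from collections import deque
from typing import Callable, Iterable, List, Set, Tuple


def crawl(
    bootstrap: Iterable[str],
    query_neighbors: Callable[[str], List[str]],
) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Breadth-first traversal of an overlay network.

    Returns every peer that was learned about and the directed edges
    (peer, neighbor) that were observed. Only a subset of the real
    connections is visible, because unreachable peers cannot be queried.
    """
    seen: Set[str] = set(bootstrap)
    edges: Set[Tuple[str, str]] = set()
    queue = deque(seen)

    while queue:
        peer = queue.popleft()
        try:
            neighbors = query_neighbors(peer)
        except ConnectionError:
            # Peer is known but unreachable: keep it in `seen`, skip its edges.
            continue
        for neighbor in neighbors:
            edges.add((peer, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)

    return seen, edges


# Hypothetical toy overlay: peer "E" is referenced by "D" but is offline.
TOY_NETWORK = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["E"],
}


def toy_query(peer: str) -> List[str]:
    """Simulated routing-table lookup; raises if the peer is offline."""
    if peer not in TOY_NETWORK:
        raise ConnectionError(f"peer {peer} is unreachable")
    return TOY_NETWORK[peer]


peers, edges = crawl(["A"], toy_query)
print(sorted(peers))
print(sorted(edges))
```

Starting from a single bootstrap peer, the sketch discovers all five peers, including the offline one, but records no outgoing edges for the peer it could not query. This mirrors why a crawl yields only a subset of the connections between peers.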
