Distributed Storage Allocation for High Reliability

We consider the problem of optimally allocating a given total storage budget in a distributed storage system. A source has a data object which it can code and store over a set of storage nodes; it is allowed to store any amount of data in each storage node, subject to a given total storage budget constraint. A data collector subsequently attempts to recover the original data object by accessing a random fixed-size subset of these storage nodes. Successful recovery of the data object occurs when the total amount of coded data in this subset of storage nodes is at least the size of the original data object. The goal is to determine the amount of data to store in each storage node so that the probability of successful recovery is maximized. We solve this problem in the high recovery probability regime. Our results can be applied to a variety of distributed storage systems, including delay tolerant networks (DTNs), content delivery networks (CDNs), and sensor networks.
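To make the recovery criterion concrete, the following minimal sketch computes the probability of successful recovery for a given allocation by enumerating every subset the data collector could access. It rests on assumptions not stated in the abstract: the subset is drawn uniformly at random, the data object has unit size, and the function name `recovery_probability` and the example values (5 nodes, subsets of size 2, total budget 2) are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations


def recovery_probability(allocation, r, object_size=1):
    """Exact probability that a uniformly random r-subset of nodes
    stores at least `object_size` units of coded data in total.

    Assumption: every r-subset is equally likely to be accessed.
    """
    n = len(allocation)
    total = 0
    successes = 0
    for subset in combinations(range(n), r):
        total += 1
        # Recovery succeeds when the accessed nodes jointly hold
        # at least as much coded data as the original object.
        if sum(allocation[i] for i in subset) >= object_size:
            successes += 1
    return Fraction(successes, total)


# Three ways to spend the same total budget of 2 over 5 nodes
# (hypothetical values; exact fractions avoid rounding at the threshold).
spread_evenly = [Fraction(2, 5)] * 5
half_on_four = [Fraction(1, 2)] * 4 + [Fraction(0)]
whole_on_two = [Fraction(1), Fraction(1), Fraction(0), Fraction(0), Fraction(0)]

for name, alloc in [("spread evenly", spread_evenly),
                    ("half on four nodes", half_on_four),
                    ("whole object on two nodes", whole_on_two)]:
    print(name, recovery_probability(alloc, r=2))
```

In this small instance, spreading the budget evenly never allows recovery, since any two nodes hold only 4/5 of the object, while the two more concentrated allocations succeed with probability 3/5 and 7/10 respectively. The sketch only evaluates candidate allocations by brute force; it does not find the optimal allocation and is practical only for small numbers of nodes.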
