Achieving Private Information Retrieval Capacity in Distributed Storage Using an Arbitrary Linear Code

We propose three private information retrieval (PIR) protocols for distributed storage systems (DSSs) where data is stored using an arbitrary linear code. The first two protocols, named Protocol 1 and Protocol 2, achieve privacy for the scenario with noncolluding nodes. Protocol 1 requires a file size that is exponential in the number of files in the system, while the file size required for Protocol 2 is independent of the number of files, which makes Protocol 2 the simpler of the two. We prove that, for certain linear codes, Protocol 1 achieves the PIR capacity, i.e., its PIR rate (the amount of retrieved stored data per unit of downloaded data) is the maximum possible for any given number of files, finite or infinite, and that Protocol 2 achieves the asymptotic PIR capacity (as the number of files in the DSS grows to infinity). In particular, we provide a sufficient condition and a necessary condition for a code to be PIR capacity-achieving, and we prove that cyclic codes, Reed-Muller (RM) codes, and optimal information locality local reconstruction codes achieve both the finite PIR capacity (i.e., with any given number of files) and the asymptotic PIR capacity with Protocols 1 and 2, respectively. Furthermore, we present a third protocol, Protocol 3, for the scenario with multiple colluding nodes, which can be seen as an improvement of a protocol recently introduced by Freij-Hollanti et al. We also present an algorithm to optimize the PIR rate of Protocol 3. Finally, we provide a particular class of codes that is suitable for this protocol and show that RM codes achieve the maximum possible PIR rate for the protocol.
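To make the distinction between the finite and the asymptotic PIR capacity concrete, the following is a minimal sketch. It assumes the capacity expression known from the literature for MDS-coded storage with noncolluding nodes (Banawan and Ulukus), namely (1 - k/n) / (1 - (k/n)^f), which tends to 1 - k/n as the number of files grows. The abstract does not state this formula; the symbols `n` (code length), `k` (code dimension), and `f` (number of files), and both function names, are illustrative assumptions.

```python
from fractions import Fraction


def finite_pir_capacity(n: int, k: int, f: int) -> Fraction:
    """Assumed capacity expression for an [n, k] MDS-coded DSS storing f files,
    noncolluding nodes: (1 - k/n) / (1 - (k/n)**f)."""
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    if f < 1:
        raise ValueError("need at least one file")
    code_rate = Fraction(k, n)
    return (1 - code_rate) / (1 - code_rate ** f)


def asymptotic_pir_capacity(n: int, k: int) -> Fraction:
    """Limit of the expression above as f grows without bound: 1 - k/n."""
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    return 1 - Fraction(k, n)


if __name__ == "__main__":
    # Hypothetical [5, 3] code; the rate decreases toward 1 - 3/5 as f grows.
    for f in (1, 2, 4, 16):
        print(f, finite_pir_capacity(5, 3, f))
    print("limit", asymptotic_pir_capacity(5, 3))
```

For the hypothetical [5, 3] code, two files give a rate of 5/8, while the asymptotic value is 2/5. In the abstract's terms, Protocol 1 targets the finite-file value and Protocol 2 targets the limit.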
