Sparse graph codes for compression, sensing, and secrecy

Sparse graph codes were first introduced by Gallager over 40 years ago. Over the last two decades, such codes have been the subject of intense research, and capacity-approaching sparse graph codes with low-complexity encoding and decoding algorithms have been designed for many channels. Motivated by the success of sparse graph codes for channel coding, we explore their use for four other problems related to compression, sensing, and security.

First, we construct locally encodable and decodable source codes for a simple class of sources. Local encodability refers to the property that when the original source data changes slightly, the compressed representation can be updated easily. Local decodability refers to the property that a single source symbol can be recovered without decoding the entire source block.

Second, we analyze a simple message-passing algorithm for compressed sensing recovery and show that it provides a nontrivial ℓ1/ℓ1 guarantee. We also show that very sparse matrices, and matrices whose entries must be either 0 or 1, have poor performance with respect to the restricted isometry property for the ℓ2 norm.

Third, we analyze the performance of a special class of sparse graph codes, LDPC codes, for the problem of quantizing a uniformly random bit string under Hamming distortion. We show that LDPC codes can come arbitrarily close to the rate-distortion bound when an optimal quantizer is used. This is a special case of a general result establishing a duality between lossy source coding and channel coding: if we ignore computational complexity, then good channel codes are automatically good lossy source codes. We also prove a lower bound on the average degree of the vertices of an LDPC code as a function of its gap to the rate-distortion bound.

Finally, we construct efficient, capacity-achieving codes for the wiretap channel, a model of communication that allows one to provide information-theoretic, rather than computational, security guarantees. Our main results include the introduction of a new security criterion that is an information-theoretic analog of semantic security, the construction of capacity-achieving codes possessing strong security with nearly linear-time encoding and decoding algorithms for any degraded wiretap channel, and the construction of capacity-achieving codes possessing semantic security with linear-time encoding and decoding algorithms for erasure wiretap channels.

Our analysis relies on a relatively small set of tools. One tool is density evolution, a powerful method for analyzing the behavior of message-passing algorithms on long, random sparse graph codes. Another concept we use extensively is the expander graph. Expander graphs have powerful properties that allow us to prove adversarial, rather than probabilistic, guarantees for message-passing algorithms; they are also useful in the context of the wiretap channel because they provide a method for constructing randomness extractors. Finally, we use several well-known isoperimetric and concentration inequalities (Harper's inequality, Azuma's inequality, and the Gaussian isoperimetric inequality) in our analysis of the duality between lossy source coding and channel coding.
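For concreteness, an ℓ1/ℓ1 guarantee for recovering an approximately k-sparse signal x from linear measurements Ax has the following standard form (a generic statement of such guarantees, not necessarily the exact one proved in the thesis; the constant C depends on the measurement matrix and the algorithm):

    \| x - \hat{x} \|_1 \le C \cdot \min_{\|z\|_0 \le k} \| x - z \|_1,

where \hat{x} is the estimate returned by the recovery algorithm and the minimum ranges over all k-sparse vectors z. In words, the recovery error in ℓ1 is within a constant factor of the best k-term approximation error in ℓ1.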
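Similarly, the rate-distortion bound for the quantization problem in the third part is the classical Shannon rate-distortion function of a uniform (Bernoulli(1/2)) binary source under Hamming distortion:

    R(D) = 1 - h_2(D), \qquad h_2(D) = -D \log_2 D - (1-D) \log_2(1-D), \qquad 0 \le D \le 1/2.

For example, at average distortion D = 0.11 this gives R(0.11) \approx 0.5, so roughly half a bit per source bit suffices; the result above states that LDPC codes, decoded with an optimal quantizer, can approach this curve arbitrarily closely.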
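The secrecy criteria mentioned in the fourth part can be stated in standard information-theoretic terms (these are the usual notions from the literature; the thesis's new criterion has its own precise formulation and is only sketched here at the level of the standard definitions). Writing M for the message and Z^n for the eavesdropper's observation, strong secrecy asks that

    I(M; Z^n) \to 0 \quad \text{as } n \to \infty

when M is uniformly distributed, while a semantic-security-style criterion asks that the same quantity vanish for every distribution on the message, i.e. \max_{P_M} I(M; Z^n) \to 0.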

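The basic mechanism behind information-theoretic secrecy on erasure-type wiretap channels is coset coding, which a deliberately tiny example can illustrate. The sketch below is the classical one-bit parity scheme for a wiretap channel of type II, written as an illustration only; the function names encode/decode and the block length n = 8 are arbitrary choices and this is not one of the capacity-achieving constructions described above.

    import secrets

    def encode(secret_bit, n=8):
        """Hide one secret bit in the parity of an n-bit codeword.
        The first n-1 bits are uniformly random; the last bit fixes the parity."""
        pad = [secrets.randbelow(2) for _ in range(n - 1)]
        return pad + [(secret_bit + sum(pad)) % 2]

    def decode(codeword):
        """The legitimate receiver sees every bit and reads off the parity."""
        return sum(codeword) % 2

    if __name__ == "__main__":
        for s in (0, 1):
            cw = encode(s)
            assert decode(cw) == s
            # An eavesdropper who observes any n-1 of the n positions sees a
            # uniformly random (n-1)-bit string regardless of the secret, so
            # its view is statistically independent of the secret bit.
            print(s, cw)

Capacity-achieving schemes for erasure wiretap channels replace the single parity check with the cosets of a stronger code, and, as noted above, randomness extractors built from expander graphs are one ingredient in moving beyond the erasure setting; the secrecy mechanism in this toy example is the simplest instance of the idea.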
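Of the tools listed above, density evolution is the most mechanical to demonstrate. On the binary erasure channel it collapses to a one-dimensional recursion, and the following sketch estimates the belief-propagation threshold of a regular LDPC ensemble. It is a minimal illustration, not the analysis carried out in the thesis: the (3,6)-regular ensemble, the iteration count, and the tolerance are arbitrary choices made for the example.

    def density_evolution_converges(eps, dv=3, dc=6, iters=5000, tol=1e-10):
        """Return True if the BEC density-evolution recursion for a
        (dv, dc)-regular LDPC ensemble drives the erasure probability
        to (essentially) zero at channel erasure rate eps."""
        x = eps  # probability that a variable-to-check message is an erasure
        for _ in range(iters):
            # A check-to-variable message is erased unless all dc-1 incoming
            # messages are known; a variable-to-check message is erased only
            # if the channel and all dv-1 incoming check messages are erased.
            x = eps * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)
            if x < tol:
                return True
        return False

    def bp_threshold(dv=3, dc=6):
        """Bisection search for the largest erasure rate at which density
        evolution still converges (the BP threshold of the ensemble)."""
        lo, hi = 0.0, 1.0
        for _ in range(50):
            mid = (lo + hi) / 2.0
            if density_evolution_converges(mid, dv, dc):
                lo = mid
            else:
                hi = mid
        return lo

    if __name__ == "__main__":
        # For the (3,6)-regular ensemble this prints roughly 0.4294, compared
        # with the Shannon limit 1 - R = 0.5 for the design rate R = 1 - dv/dc.
        print(bp_threshold(3, 6))

Closing the gap between such thresholds and the Shannon limit is exactly what the capacity-approaching designs mentioned in the opening paragraph accomplish.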