Statistical Physics and Information Theory Perspectives on Linear Inverse Problems

Many real-world problems in machine learning, signal processing, and communications assume that an unknown vector $x$ is measured by a matrix $A$, resulting in a vector $y=Ax+z$, where $z$ denotes the noise; we call this a single measurement vector (SMV) problem. Sometimes, multiple dependent vectors $x^{(j)}$, $j\in \{1,\ldots,J\}$, are measured at the same time, forming the so-called multi-measurement vector (MMV) problem. Both SMV and MMV are linear models (LMs), and the process of estimating the underlying vector(s) $x$ from an LM, given the matrices, the noisy measurements, and knowledge of the noise statistics, is called a linear inverse problem. In some scenarios, the matrix $A$ is stored in a single processor, and this processor also records the measurements $y$; this is called a centralized LM. In other scenarios, multiple sites measure the same underlying unknown vector $x$, and each site possesses only part of the matrix $A$; we call this a multi-processor LM. Recently, due to ever-increasing amounts of data and ever-growing dimensions in LMs, it has become more important to study large-scale linear inverse problems. In this dissertation, we take advantage of tools from statistical physics and information theory to advance the understanding of large-scale linear inverse problems. The intuition behind applying statistical physics to our problem is that statistical physics deals with large-scale systems, and we can draw an analogy between an LM and a thermodynamic system. As for information theory, although it was originally developed to characterize the theoretical limits of digital communication systems, it was later found to be rather useful in analyzing and understanding other inference problems. (The full abstract does not fit within the space limit; please refer to the PDF.)
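To make the centralized and multi-processor settings concrete, here is a minimal Python sketch of the measurement model $y=Ax+z$. Several details are assumptions for illustration only and are not stated in the abstract: the noise $z$ is drawn as i.i.d. Gaussian, $x$ is sparse, $A$ has i.i.d. Gaussian entries, and "each site possesses part of $A$" is modeled as a row-wise split. The helper names `matvec`, `measure`, and `split_rows` are hypothetical.

```python
import random


def matvec(A, x):
    """Return the matrix-vector product A x for a list-of-rows matrix A."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def measure(A, x, noise_std, rng):
    """SMV model y = A x + z, with z assumed i.i.d. zero-mean Gaussian."""
    return [v + rng.gauss(0.0, noise_std) for v in matvec(A, x)]


def split_rows(A, y, num_sites):
    """Assumed multi-processor layout: each site holds a contiguous block
    of rows of A together with the matching entries of y."""
    m = len(A)
    bounds = [m * p // num_sites for p in range(num_sites + 1)]
    return [(A[lo:hi], y[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]


rng = random.Random(0)
n, m = 8, 4  # signal length and number of measurements (toy sizes)

# Assumed sparse signal: each entry is nonzero with probability 0.25.
x = [rng.gauss(0.0, 1.0) if rng.random() < 0.25 else 0.0 for _ in range(n)]

# Assumed i.i.d. Gaussian measurement matrix with variance 1/m per entry.
A = [[rng.gauss(0.0, m ** -0.5) for _ in range(n)] for _ in range(m)]

y = measure(A, x, noise_std=0.1, rng=rng)   # centralized LM: one processor holds A and y
sites = split_rows(A, y, num_sites=2)       # multi-processor LM: A and y split across sites
```

Stacking the per-site measurements recovers the centralized vector $y$, so under this assumed layout the two settings differ only in where the rows of $A$ and the entries of $y$ are stored, not in the underlying model.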
