Factor graphs and the sum-product algorithm

Algorithms that must deal with complicated global functions of many variables often exploit the manner in which the given functions factor as a product of "local" functions, each of which depends on a subset of the variables. Such a factorization can be visualized with a bipartite graph that we call a factor graph. In this tutorial paper, we present a generic message-passing algorithm, the sum-product algorithm, that operates in a factor graph. Following a single, simple computational rule, the sum-product algorithm computes, either exactly or approximately, various marginal functions derived from the global function. A wide variety of algorithms developed in artificial intelligence, signal processing, and digital communications can be derived as specific instances of the sum-product algorithm, including the forward/backward algorithm, the Viterbi algorithm, the iterative "turbo" decoding algorithm, Pearl's (1988) belief propagation algorithm for Bayesian networks, the Kalman filter, and certain fast Fourier transform (FFT) algorithms.
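As an illustration of the message-passing idea described above, the following minimal sketch computes one marginal of a small global function that factors into local functions. The example function g(x1, x2, x3) = fA(x1) fB(x1, x2) fC(x2, x3), the numeric tables, and all function names are hypothetical choices made for this sketch and do not come from the paper. The factor graph here is a tree, so the result should agree exactly with a brute-force sum over all configurations.

```python
from itertools import product

# Hypothetical toy problem (not from the paper): binary variables x1, x2, x3
# and a global function g(x1, x2, x3) = f_a(x1) * f_b(x1, x2) * f_c(x2, x3).
DOMAIN = (0, 1)

def f_a(x1):
    return [0.6, 0.4][x1]

def f_b(x1, x2):
    return [[0.9, 0.1], [0.2, 0.8]][x1][x2]

def f_c(x2, x3):
    # Local functions need not be normalized.
    return [[1.0, 2.0], [3.0, 1.0]][x2][x3]

def marginal_x2_sum_product():
    """Marginal of g with respect to x2, computed by passing messages toward x2."""
    # Leaf factor f_a sends its own table to x1.
    msg_a_to_x1 = [f_a(x1) for x1 in DOMAIN]
    # Variable x1 forwards the product of messages from its other neighbors (only f_a).
    msg_x1_to_b = msg_a_to_x1
    # Factor f_b multiplies by the incoming message and sums out x1.
    msg_b_to_x2 = [
        sum(f_b(x1, x2) * msg_x1_to_b[x1] for x1 in DOMAIN) for x2 in DOMAIN
    ]
    # Leaf variable x3 sends the unit message to f_c.
    msg_x3_to_c = [1.0 for _ in DOMAIN]
    # Factor f_c multiplies by the incoming message and sums out x3.
    msg_c_to_x2 = [
        sum(f_c(x2, x3) * msg_x3_to_c[x3] for x3 in DOMAIN) for x2 in DOMAIN
    ]
    # The marginal at x2 is the product of all messages arriving at x2.
    return [msg_b_to_x2[x2] * msg_c_to_x2[x2] for x2 in DOMAIN]

def marginal_x2_brute_force():
    """Same marginal, computed by summing g over every value of x1 and x3."""
    return [
        sum(f_a(x1) * f_b(x1, x2) * f_c(x2, x3)
            for x1, x3 in product(DOMAIN, DOMAIN))
        for x2 in DOMAIN
    ]

if __name__ == "__main__":
    print(marginal_x2_sum_product())
    print(marginal_x2_brute_force())
```

In this sketch each message is a small table indexed by the values of one variable, so the work grows with the size of the local functions rather than with the number of joint configurations of all the variables.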

[1] Robert G. Gallager, et al. Low-density parity-check codes, 1962, IRE Trans. Inf. Theory.

[2] L. Baum, et al. Statistical Inference for Probabilistic Functions of Finite State Markov Chains, 1966.

[3] Umberto Bertelè, et al. Nonserial Dynamic Programming, 1972.

[4] C. Preston. Gibbs States on Countable Sets, 1974.

[5] John Cocke, et al. Optimal decoding of linear codes for minimizing symbol error rate (Corresp.), 1974, IEEE Trans. Inf. Theory.

[6] David S. Johnson, et al. Computers and Intractability: A Guide to the Theory of NP-Completeness, 1978.

[7] J. Laurie Snell, et al. Markov Random Fields and Their Applications, 1980.

[8] Robert Michael Tanner, et al. A recursive approach to low complexity codes, 1981, IEEE Trans. Inf. Theory.

[9] V. Isham. An Introduction to Spatial Point Processes and Markov Random Fields, 1981.

[10] B. Anderson, et al. Optimal Filtering, 1979, IEEE Transactions on Systems, Man, and Cybernetics.

[11] Kenneth H. Rosen. Discrete mathematics and its applications, 1984.

[12] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.

[13] Geoffrey E. Hinton, et al. Learning and relearning in Boltzmann machines, 1986.

[14] S. Verdú, et al. Abstract dynamic programming models under commutativity conditions, 1987.

[15] David J. Spiegelhalter, et al. Local computations with probabilities on graphical structures and their application to expert systems, 1990.

[16] Judea Pearl, et al. Probabilistic reasoning in intelligent systems, 1988.

[17] Mario Stefanelli, et al. Contribution to the discussion of the paper by Steffen L. Lauritzen and David Spiegelhalter: "Local Computations with Probabilities on Graphical Structures and their Application to Expert Systems", 1988.

[18] M. V. Rossum, et al. In Neural Computation, 2022.

[19] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.

[20] Jan C. Willems, et al. Models for Dynamics, 1989.

[21] A. Glavieux, et al. Near Shannon limit error-correcting coding and decoding: Turbo-codes. 1, 1993, Proceedings of ICC '93 - IEEE International Conference on Communications.

[22] Geoffrey E. Hinton, et al. The "wake-sleep" algorithm for unsupervised neural networks, 1995, Science.

[23] Hans-Andrea Loeliger, et al. Codes and iterative decoding on general graphs, 1995, Eur. Trans. Telecommun.

[24] David J. C. MacKay, et al. Good Codes Based on Very Sparse Matrices, 1995, IMACC.

[25] Brendan J. Frey, et al. Probability Propagation and Iterative Decoding, 1996.

[26] Finn Verner Jensen, et al. Introduction to Bayesian Networks, 2008, Innovations in Bayesian Networks.

[27] Sergio Benedetto, et al. Iterative decoding of serially concatenated convolutional codes, 1996.

[28] Joachim Hagenauer, et al. Iterative decoding of binary block and convolutional codes, 1996, IEEE Trans. Inf. Theory.

[29] Niclas Wiberg, et al. Codes and Decoding on General Graphs, 1996.

[30] Peter Dayan, et al. A simple algorithm that discovers efficient perceptual codes, 1997.

[31] Robert J. McEliece, et al. A general algorithm for distributing information in a graph, 1997, Proceedings of IEEE International Symposium on Information Theory.

[32] Jung-Fu Cheng, et al. Turbo Decoding as an Instance of Pearl's "Belief Propagation" Algorithm, 1998, IEEE J. Sel. Areas Commun.

[33] Brendan J. Frey, et al. Graphical Models for Machine Learning and Digital Communication, 1998.

[34] Dariush Divsalar, et al. Coding theorems for 'turbo-like' codes, 1998.

[35] Brendan J. Frey, et al. Iterative Decoding of Compound Codes by Probability Propagation in Graphical Models, 1998, IEEE J. Sel. Areas Commun.

[36] David J. C. MacKay, et al. Good Error-Correcting Codes Based on Very Sparse Matrices, 1997, IEEE Trans. Inf. Theory.

[37] Brendan J. Frey, et al. Variational Learning in Nonlinear Gaussian Belief Networks, 1999, Neural Computation.

[38] Robert J. McEliece, et al. The generalized distributive law, 2000, IEEE Trans. Inf. Theory.

[39] G. Forney, et al. Codes on graphs: normal realizations, 2000, 2000 IEEE International Symposium on Information Theory (Cat. No.00CH37060).

[40] G.D. Forney, et al. Codes on graphs: Normal realizations, 2000, IEEE Trans. Inf. Theory.

[41] Roberto Garello, et al. Interleaver properties and their applications to the trellis complexity analysis of turbo codes, 2001, IEEE Trans. Commun.

[42] S. Yau. Mathematics and its applications, 2002.