A Generalization of the Blahut–Arimoto Algorithm to Finite-State Channels

The classical Blahut–Arimoto algorithm (BAA) optimizes a discrete memoryless source (DMS) at the input of a discrete memoryless channel (DMC) in order to maximize the mutual information between channel input and output. This paper considers the problem of optimizing finite-state machine sources (FSMSs) at the input of finite-state machine channels (FSMCs) in order to maximize the mutual information rate between channel input and output. Our main result is an algorithm that solves this problem numerically and efficiently; hence we call the proposed procedure the generalized BAA. It includes as special cases not only the classical BAA but also an algorithm that finds the capacity-achieving input distribution for noiseless finite-state channels. While we present theorems that characterize the local behavior of the generalized BAA, questions concerning its global behavior remain open; we address them with conjectures at the end of the paper. Beyond these algorithmic issues, our results yield insights into the local conditions that information-rate-maximizing FSMSs fulfill; these conditions naturally generalize the well-known Kuhn–Tucker conditions fulfilled by capacity-achieving DMSs at the input of DMCs.
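
For context, the classical BAA that the paper generalizes iterates a multiplicative update on the input distribution and yields converging lower and upper bounds on capacity. Below is a minimal Python sketch of that classical algorithm for a DMC, assuming the standard textbook update rule; the function name, tolerances, and the binary-symmetric-channel example are illustrative and not taken from the paper, and the generalized BAA for FSMSs/FSMCs is not reproduced here.

```python
import numpy as np

def blahut_arimoto(W, tol=1e-9, max_iter=10_000):
    """Classical Blahut-Arimoto for a DMC with transition matrix
    W[x, y] = P(Y = y | X = x). Returns (capacity in nats, optimal input pmf)."""
    n_x, _ = W.shape
    p = np.full(n_x, 1.0 / n_x)           # start from the uniform input pmf
    lower = 0.0
    for _ in range(max_iter):
        q = p @ W                          # output pmf: q(y) = sum_x p(x) W(y|x)
        # D(x) = exp( sum_y W(y|x) log( W(y|x) / q(y) ) ), with 0 log 0 := 0
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(W > 0, np.log(W / q), 0.0)
        D = np.exp(np.sum(W * log_ratio, axis=1))
        lower = np.log(p @ D)              # lower bound on capacity (nats)
        upper = np.log(D.max())            # upper bound on capacity (nats)
        p = p * D / (p @ D)                # multiplicative BAA update
        if upper - lower < tol:
            break
    return lower, p

# Example: binary symmetric channel with crossover 0.1;
# capacity = log 2 - H_b(0.1), roughly 0.531 bits.
W = np.array([[0.9, 0.1],
              [0.1, 0.9]])
C, p_opt = blahut_arimoto(W)
print(C / np.log(2))                       # ~0.531 bits; p_opt is uniform
```

The generalized BAA replaces the per-letter input pmf above with the transition probabilities of an FSMS and the per-letter mutual information with the mutual information rate of the FSMC, which is why its behavior is only characterized locally in the paper.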
