The Role of Information Theory in Emission Tomography

In emission tomography, which is useful for studying brain function, a source of radioactivity is ingested by a person, say as sugar, and a Poisson number, $n(b)$, of radioactive emissions arises in each box (pixel), $b$, of the brain, depending on the brain activity there; the mean $\lambda(b) = E\,n(b)$ is sought. Each emission (not directly observable) makes an independent Markovian transition to some detector unit, $d$, with probability $p(b,d)$, $\sum_d p(b,d) = 1$, where $p(b,d)$ is known from the geometry and performance of the detectors. We measure $n^*(d)$, the total number of counts in each $d$, and wish to estimate $\lambda(b)$ to get an image of the brain activity, say during counting, speaking, or other function. For each $\lambda$ there is a likelihood (see Appendix A), $\Lambda(\lambda)$, to observe $n^*$, and one popular approach to reconstructing or estimating $\lambda$ is to seek a maximum likelihood estimator (MLE). Surprisingly enough, ideas of information theory have provided useful insight into the theoretical understanding of MLE even though entropy doesn't appear to be directly involved. No one knows how to produce an MLE directly, but the so-called EM algorithm is used, beginning with an initial $\lambda_0$, to produce ever more likely estimates $\lambda_1, \lambda_2, \ldots$. The only rigorous proof [1] of convergence of the EM iterates to a limit maximizing $\Lambda(\lambda)$ is heavily information theoretic. Unfortunately this limiting MLE was seen [2] not to be a robust estimate, due to the fact that $n(b)$ is small and hence statistically noisy, and indeed was totally useless as a practical image. If MLE were not unique then the various ML estimators could be averaged, and since $\Lambda(\lambda)$ is seen to be log concave (see Appendix A), an estimate could be obtained which is both smooth and maximally likely. On empirical grounds it was conjectured [2] in 1988 that MLE was, under general conditions, unique. Very recently, again using ideas of information theory, Charles L. Byrne succeeded [3] in formulating a general and natural hypothesis on $p(b,d)$ under which the conjecture is true.
This dashes all hope that smooth MLEs exist in practical emission tomography. The present approaches involve either stopping the iteration early, smoothing at each step or at the end, or maximizing posterior likelihood with a Gibbs prior. I hope information theory will continue to shed light on emission tomography.
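As an illustration of the EM iteration discussed above, here is a minimal sketch in Python. It is not taken from the text: it assumes the standard multiplicative EM update for this Poisson model. Writing $\lambda(b)$ for the expected emission count in box $b$ and $n^*(d)$ for the measured counts, that update is $\lambda_{k+1}(b) = \lambda_k(b) \sum_d n^*(d)\, p(b,d) / \sum_{b'} \lambda_k(b')\, p(b',d)$. The function names (`expected_counts`, `log_likelihood`, `em_step`, `run_em`) and the toy 3-box, 4-detector matrix are hypothetical, chosen only for the example.

```python
import math

def expected_counts(lam, p):
    """Mean detector counts mu(d) = sum_b lam(b) * p(b, d)."""
    n_det = len(p[0])
    return [sum(lam[b] * p[b][d] for b in range(len(lam))) for d in range(n_det)]

def log_likelihood(lam, p, counts):
    """Poisson log-likelihood of the counts, up to an additive constant."""
    mu = expected_counts(lam, p)
    return sum((n * math.log(m) if n > 0 else 0.0) - m for n, m in zip(counts, mu))

def em_step(lam, p, counts):
    """One multiplicative EM update (assumes each row of p sums to 1)."""
    mu = expected_counts(lam, p)
    return [
        lam[b] * sum(n * p[b][d] / mu[d] for d, n in enumerate(counts) if n > 0)
        for b in range(len(lam))
    ]

def run_em(lam0, p, counts, n_iter):
    """Iterate from lam0; return the final estimate and the likelihood history."""
    lam = list(lam0)
    history = [log_likelihood(lam, p, counts)]
    for _ in range(n_iter):
        lam = em_step(lam, p, counts)
        history.append(log_likelihood(lam, p, counts))
    return lam, history

# Toy data (hypothetical): p[b][d] with rows summing to 1, and counts n*(d).
p = [
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.4, 0.3, 0.1],
    [0.1, 0.1, 0.3, 0.5],
]
counts = [30, 28, 22, 20]
lam, history = run_em([1.0, 1.0, 1.0], p, counts, n_iter=50)
print(lam, history[-1])
```

In this sketch the `n_iter` argument is where early stopping would enter: a small value halts the iteration before the estimate becomes noisy, at the cost of not reaching the maximum of the likelihood.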