The role of likelihood and entropy in incomplete-data problems: Applications to estimating point-process intensities and Toeplitz constrained covariances

The principle of maximum entropy has played an important role in the solution of problems in which the measurements correspond to moment constraints on some many-to-one mapping h(x). In this paper we explore its role in estimation problems in which the measured data are statistical observations and moment constraints on the observation function h(x) do not exist. We conclude that: 1) For the class of likelihood problems arising in a complete-incomplete data context in which the complete data x are nonuniquely determined by the measured incomplete data y via the many-to-one mapping y = h(x), the density maximizing entropy is identical to the conditional density of the complete data given the incomplete data. This equivalence results from viewing the measurements as specifying the domain over which the density is defined, rather than as a moment constraint on h(x). 2) Because the maximum-entropy density and the conditional density are identical, maximum-likelihood estimates may be obtained via a joint maximization (minimization) of the entropy function (Kullback-Leibler divergence). This provides the basis for the iterative algorithm of Dempster, Laird, and Rubin [1] for the maximization of likelihood functions. 3) This iterative method is used for maximum-likelihood estimation of image parameters in emission tomography and gamma-ray astronomy. We demonstrate that unconstrained likelihood estimation of image intensities from finite data sets yields unstable estimates. We show how Grenander's method of sieves can be used with the iterative algorithm to remove the instability. A bandwidth sieve is introduced, resulting in an estimator which is smoothed via exponential splines. 4) We also derive a recursive algorithm for the generation of Toeplitz constrained maximum-likelihood estimators which at each iteration evaluates conditional mean estimates of the lag products based on the previous estimate of the covariance, from which the updated Toeplitz covariance is generated. We prove that the sequence of Toeplitz estimators increases in likelihood, remains in the set of positive-definite Toeplitz covariances, and has all of its limit points stable and satisfying the necessary conditions for maximizing the likelihood.
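To make the iterative method of point 3) concrete, the following is a minimal sketch of the expectation-maximization update for Poisson count data of the kind encountered in emission tomography. It is not taken from the paper: the system matrix `A`, the problem sizes, the random seed, and the function names `em_poisson` and `poisson_loglik` are all illustrative assumptions, and no sieve or smoothing is applied, so the estimate is the unconstrained one that the abstract describes as unstable for finite data.

```python
import numpy as np

def poisson_loglik(A, lam, y):
    """Poisson log-likelihood of counts y with mean A @ lam (constant terms dropped)."""
    mu = A @ lam
    return float(np.sum(y * np.log(mu) - mu))

def em_poisson(A, y, n_iter=50):
    """EM iterations for intensities lam >= 0 under the model y ~ Poisson(A @ lam).

    A[d, b] is an assumed-known probability that an emission in pixel b is
    recorded in detector bin d. Returns the final estimate and the
    log-likelihood after each iteration.
    """
    n_pix = A.shape[1]
    sens = A.sum(axis=0)                         # detection probability per pixel
    lam = np.full(n_pix, y.sum() / sens.sum())   # flat, strictly positive start
    history = []
    for _ in range(n_iter):
        mu = A @ lam                             # predicted mean counts per bin
        # Conditional mean of the complete-data counts (emissions per pixel)
        # given y and the current estimate, normalized by the sensitivity.
        lam = lam / sens * (A.T @ (y / mu))
        history.append(poisson_loglik(A, lam, y))
    return lam, history

# Hypothetical toy problem: 16 pixels observed through 40 detector bins.
rng = np.random.default_rng(0)
A = rng.random((40, 16)) + 0.01                  # strictly positive entries
A /= A.sum(axis=0)                               # each column sums to 1
true_lam = rng.uniform(5, 50, size=16)
y = rng.poisson(A @ true_lam).astype(float)

lam_hat, loglik = em_poisson(A, y)
```

Under these assumptions the recorded log-likelihood values should be nondecreasing from one iteration to the next, and the estimate stays nonnegative without any explicit constraint. A sieve-constrained variant would restrict or smooth `lam` in addition to this update.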

[1]  Donald L. Snyder, et al.  An Evaluation of an Improved Method for Computing Histograms in Dynamic Tracer Studies Using Positron-Emission Tomography , 1986, IEEE Transactions on Nuclear Science.

[2]  E. Jaynes Information Theory and Statistical Mechanics , 1957 .

[3]  B. Frieden Restoring with maximum likelihood and maximum entropy. , 1972, Journal of the Optical Society of America.

[4]  J. Burg Estimation of Structured Covariance Matrices a Generalization of the Burg Technique , 1983 .

[5]  I. Rhodes A tutorial introduction to estimation and filtering , 1971 .

[6]  Michael I. Miller,et al.  The Use of Sieves to Stabilize Images Produced with the EM Algorithm for Emission Tomography , 1985, IEEE Transactions on Nuclear Science.

[7]  Stephen J. Wernecke,et al.  Maximum Entropy Image Reconstruction , 1977, IEEE Transactions on Computers.

[8]  J. P. Burg,et al.  Maximum entropy spectral analysis. , 1967 .

[9]  R. Tapia,et al.  Nonparametric Probability Density Estimation , 1978 .

[10]  K. B. Larson,et al.  Maximum-likelihood estimation applied to electron microscopic autoradiography , 1985 .

[11]  Stephen M. Moore, et al.  An Evaluation of the Use of Sieves for Producing Estimates of Radioactivity Distributions with the EM Algorithm for PET , 1986, IEEE Transactions on Nuclear Science.

[12]  M. Miller,et al.  Maximum-Likelihood Reconstruction for Single-Photon Emission Computed-Tomography , 1985, IEEE Transactions on Nuclear Science.

[13]  L. Shepp,et al.  A Statistical Model for Positron Emission Tomography , 1985 .

[14]  Donald G. Childers,et al.  Modern Spectrum Analysis , 1978 .

[15]  K. Lange,et al.  EM reconstruction algorithms for emission and transmission tomography. , 1984, Journal of computer assisted tomography.

[16]  B. Roy Frieden,et al.  Restoring with maximum entropy. III. Poisson sources and backgrounds , 1978 .

[17]  L. Shepp,et al.  Maximum Likelihood PET with Real Data , 1984, IEEE Transactions on Nuclear Science.

[18]  I. Csiszár $I$-Divergence Geometry of Probability Distributions and Minimization Problems , 1975 .

[19]  E. Jaynes On the rationale of maximum-entropy methods , 1982, Proceedings of the IEEE.

[20]  Kazumi Murata,et al.  Maximum entropy image reconstruction from projections , 1981 .

[21]  Stuart Geman,et al.  Sieves for Nonparametric Estimation of Densities and Regressions. , 1981 .

[22]  Jan M. Van Campenhout,et al.  Maximum entropy and conditional probability , 1981, IEEE Trans. Inf. Theory.

[23]  Andrew W. Strong,et al.  Maximum-entropy image processing in gamma-ray astronomy , 1979 .

[24]  Donald L. Snyder,et al.  Image Reconstruction from List-Mode Data in an Emission Tomography System Having Time-of-Flight Measurements , 1983, IEEE Transactions on Nuclear Science.

[25]  L. J. Thomas, et al.  A Mathematical Model for Positron-Emission Tomography Systems Having Time-of-Flight Measurements , 1981, IEEE Transactions on Nuclear Science.

[26]  J. Shore Minimum cross-entropy spectral analysis , 1981 .

[27]  I. Good Nonparametric roughness penalties for probability densities , 1971 .

[28]  Bruce R. Musicus,et al.  Iterative algorithms for optimal signal reconstruction and parameter identification given noisy and incomplete data , 1983, ICASSP.

[29]  D. Rubin, et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[30]  R. A. Gaskins,et al.  Nonparametric roughness penalties for probability densities , 2022 .

[31]  M. Miller,et al.  Algorithms for removing recovery-related distortion from auditory-nerve discharge patterns. , 1985, The Journal of the Acoustical Society of America.

[32]  Adriaan van den Bos,et al.  Alternative interpretation of maximum entropy spectral analysis (Corresp.) , 1971, IEEE Trans. Inf. Theory.

[33]  Thomas M. Cover,et al.  An algorithm for maximizing expected log investment return , 1984, IEEE Trans. Inf. Theory.

[34]  S. Gull,et al.  Image reconstruction from incomplete and noisy data , 1978, Nature.