Estimating probabilities from experimental frequencies.

Estimating the probability distribution q governing the behavior of a variable by sampling its value a finite number of times typically involves an error. Successive measurements yield a histogram, or frequency count f, of the possible outcomes. In this work, the probability that the true distribution is q, given that the frequency count f was sampled, is studied; this probability may be written as a Gibbs distribution. A thermodynamic potential is defined that allows an easy evaluation of the mean Kullback-Leibler divergence between the true and the measured distributions. For a large number of samples, the expectation value of any function of q is expanded in powers of the inverse number of samples. As examples, the moments, the entropy, and the mutual information are analyzed.
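The systematic error discussed above can be illustrated numerically. The sketch below (a hypothetical example, not the paper's formalism; the distribution q and sample sizes are arbitrary choices) draws finite samples from a known distribution and compares the mean bias of the plug-in entropy estimate with the well-known leading-order 1/N term, -(K-1)/(2N) nats for K outcomes:

```python
import numpy as np

rng = np.random.default_rng(0)

# True distribution q over K = 4 outcomes (arbitrary choice for illustration).
q = np.array([0.5, 0.25, 0.15, 0.1])
K = len(q)
H_true = -np.sum(q * np.log(q))  # true entropy, in nats

def plugin_entropy(counts):
    """Entropy of the empirical frequencies f = counts / N (the plug-in estimate)."""
    f = counts / counts.sum()
    f = f[f > 0]  # 0 * log(0) is taken as 0
    return -np.sum(f * np.log(f))

N = 100          # number of samples per experiment
trials = 20000   # repeat the experiment to estimate the mean bias

samples = rng.multinomial(N, q, size=trials)   # frequency counts f, one row per trial
H_est = np.array([plugin_entropy(c) for c in samples])

bias = H_est.mean() - H_true
# Leading-order bias of the plug-in entropy in an inverse-sample-number
# expansion: -(K - 1) / (2 N).
print(f"measured bias:  {bias:.4f}")
print(f"1/N prediction: {-(K - 1) / (2 * N):.4f}")
```

The measured bias is negative (the plug-in entropy underestimates the true entropy on average) and agrees with the 1/N term up to higher-order corrections, consistent with the kind of expansion the abstract describes.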