The least weighted deviation

Abstract The probability distribution p is found that minimizes the weighted deviation from a given probability distribution q when the range and the mean value of a random variable are known. A particular weighted deviation is considered only insofar as the optimization problem admits a unique, computable solution. The commonest known measures of deviation (Pearson's chi-square, the Kullback-Leibler divergence, etc.) are reobtained as functional variants of the deviations used in a Euclidean normed space. A suitable choice of the weights may improve the accuracy of predicting the unknown probability distribution. In the 19th century, R. Wolf tossed an unfair die 20,000 times and obtained a probability distribution yielding a mean value of the number of pips equal to μ = 3.5983. Knowing μ, can we predict Wolf's distribution after tossing his die only 1,000 or fewer times? The choice of the weights in the weighted deviation used in prediction proves to be essential, even when these weights depend only on the given probability distribution q.
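As a minimal sketch of the kind of problem the abstract describes (not the paper's own method), one can minimize the weighted squared deviation Σ_k w_k (p_k − q_k)² subject to normalization and the given mean, which has a closed-form solution via Lagrange multipliers. The uniform prior q, the mean μ = 3.5983 from Wolf's data, and the weight choice w_k = 1/q_k (which turns the objective into Pearson's chi-square) are illustrative assumptions; positivity of p is not enforced here, though it holds for this example.

```python
import numpy as np

def min_weighted_dev(q, values, mu, w):
    """Minimize sum_k w_k*(p_k - q_k)^2 s.t. sum p_k = 1, sum values_k*p_k = mu.

    Stationarity gives p_k = q_k + (a + b*values_k)/(2*w_k); the two
    constraints yield a 2x2 linear system for the multipliers (a, b).
    """
    c = 1.0 / (2.0 * w)
    A = np.array([[c.sum(), (c * values).sum()],
                  [(c * values).sum(), (c * values**2).sum()]])
    rhs = np.array([1.0 - q.sum(), mu - (values * q).sum()])
    a, b = np.linalg.solve(A, rhs)
    return q + (a + b * values) * c

values = np.arange(1, 7)    # pips on a die
q = np.full(6, 1 / 6)       # prior guess: fair die (assumed)
mu = 3.5983                 # Wolf's observed mean
w = 1.0 / q                 # hypothetical chi-square-style weights
p = min_weighted_dev(q, values, mu, w)
print(p, p.sum(), (values * p).sum())
```

The predicted p satisfies both constraints exactly; changing the weights w changes which distribution is selected, which is the sensitivity the abstract emphasizes.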