Bayes Estimation with Convex Loss

Let $X$ be a generalized random variable taking values in an abstract set $\mathscr{X}$ on which is defined an appropriate $\sigma$-field of subsets. Suppose that the distribution of $X$ depends on a real parameter $\Theta$ and that it is desired to estimate the value of $\Theta$ from an observation on $X$. Let $W(\cdot)$ be a sufficiently smooth, non-negative, symmetric, convex function defined on the real line. Suppose that when the true value of $\Theta$ is $\theta$ and the estimated value is $\delta$, the loss incurred is $W(\theta - \delta)$. For a given prior distribution of $\Theta$ and any $x \in \mathscr{X}$, let $F(\cdot \mid x)$ be the posterior distribution function of $\Theta$ when the observed value of $X$ is $x$. A Bayes estimate, for the given value of $x$, is a number $\delta^\ast$ such that \begin{equation*}\tag{1.1}\int_{-\infty < \theta < \infty} W(\theta - \delta^\ast) \, dF(\theta \mid x) = \inf_{-\infty < \delta < \infty} \int_{-\infty < \theta < \infty} W(\theta - \delta) \, dF(\theta \mid x).\end{equation*} Thus, for each given $x$, the problem of finding a Bayes estimate reduces to the problem of minimizing the integral \begin{equation*}\tag{1.2}\int_{-\infty < \theta < \infty} W(\theta - \delta) \, dF(\theta)\end{equation*} where $F(\cdot)$ is a specified distribution function. In Section 2 the solution of this minimization problem is presented and some properties of the minimizing values of $\delta$ are discussed. In Section 3 it is shown that a Bayes estimator $\delta^\ast(\cdot)$ satisfying (1.1) for all $x \in \mathscr{X}$ can be chosen so that it is a measurable function of $x$. In Section 4 the question of evaluating the expectation of $W\lbrack\Theta - \delta^\ast(X)\rbrack$ is considered and lower bounds for this quantity are presented. The special problem in which $W(\cdot)$ is of the form $W(t) = |t|^k, -\infty < t < \infty, k \geqq 1$, is considered in some detail. It is known (see, e.g., [1], p. 302) that for $k = 1$ the integral (1.2) is minimized when $\delta$ is a median of the distribution function $F(\cdot)$, and for $k = 2$ it is minimized when $\delta$ is the mean of $F(\cdot)$. The solution of the minimization problem presented in Section 2 is a generalization of these familiar results.
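
The two familiar cases can be checked numerically. The following is a minimal sketch, not the method of Section 2: it assumes a discrete $F(\cdot)$ with finitely many atoms, so that the integral (1.2) becomes a finite sum, and it minimizes that sum over $\delta$ by ternary search, which applies because the sum is convex in $\delta$ for $k \geqq 1$. The names `expected_loss`, `bayes_estimate`, `ATOMS`, and `PROBS`, and the choice of distribution, are illustrative assumptions.

```python
"""Numerically minimize E|Theta - delta|^k for a discrete distribution (illustrative sketch)."""

# Hypothetical example distribution: five equally likely atoms.
# Its median is 2 (unique) and its mean is (0 + 1 + 2 + 3 + 10) / 5 = 3.2.
ATOMS = [0.0, 1.0, 2.0, 3.0, 10.0]
PROBS = [0.2, 0.2, 0.2, 0.2, 0.2]


def expected_loss(delta, atoms, probs, k):
    """Return sum_i p_i * |theta_i - delta|**k, the discrete analogue of integral (1.2)."""
    return sum(p * abs(theta - delta) ** k for theta, p in zip(atoms, probs))


def bayes_estimate(atoms, probs, k, iterations=200):
    """Approximate a minimizer of the expected loss by ternary search.

    For k >= 1 the loss is convex in delta, and a minimizer lies between
    the smallest and largest atom, so the search is restricted to that interval.
    """
    lo, hi = min(atoms), max(atoms)
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if expected_loss(m1, atoms, probs, k) <= expected_loss(m2, atoms, probs, k):
            hi = m2  # a minimizer lies in [lo, m2]
        else:
            lo = m1  # a minimizer lies in [m1, hi]
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for k in (1, 2, 3):
        delta_star = bayes_estimate(ATOMS, PROBS, k)
        print(f"k = {k}: delta* ~ {delta_star:.6f}, "
              f"loss ~ {expected_loss(delta_star, ATOMS, PROBS, k):.6f}")
```

For this distribution the computed minimizer is approximately the median, 2, when $k = 1$ and approximately the mean, 3.2, when $k = 2$, in agreement with the results cited above. For other values of $k$ the minimizer is in general neither the median nor the mean.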