Evaluation of the maximum-likelihood estimator where the likelihood equation has multiple roots.

Even under the usual regularity conditions, where a unique consistent root of the likelihood equation is known to exist, it is often not possible to obtain an explicit solution for the maximum-likelihood estimate (M.L.E.) of a parameter as a function of the sample. In such cases it is necessary to evaluate the M.L.E. numerically, by successive iteration. This was first discussed in the statistical literature by Fisher (1925), who advocated the use of a method known as 'scoring for parameters'. He suggested that '...starting with an inefficient statistic, a single process of approximation will in ordinary cases give an efficient statistic differing from the maximum-likelihood solution, by a quantity which with increasing samples decreases as n^{-1}', where n is the sample size; and concluded that one iteration is therefore sufficient in a practical sense in such 'ordinary cases'. Norton (1956) has used a particular example in genetics described by Fisher (1950, chapter 9) to show that this is not true, and that several iterations may be necessary for reasonable convergence.

A variety of numerical methods are available for locating the root of an equation, of which the method of 'scoring for parameters' is but a single example of the Newtonian approach to the problem. Kale (1961) has discussed several of these methods (the fixed-derivative Newton, Newton-Raphson and 'scoring for parameters' methods) for obtaining the M.L.E. of a single parameter under the usual regularity conditions, from the point of view of whether or not they satisfy certain desirable probabilistic properties as n → ∞. He states: 'In effect, it is shown that the iteration processes usually applied in practice are justifiable, in large samples at least'. In a subsequent paper, Kale (1962) makes a similar study for the multi-parameter case.
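The three iterations named above can be compared on a small example. What follows is a minimal sketch only, under stated assumptions: the model is a four-cell multinomial with cell probabilities ((2 + θ)/4, (1 − θ)/4, (1 − θ)/4, θ/4), and the counts are illustrative values; neither is claimed to be the example treated by Fisher or Norton. The function names are hypothetical. For this particular model the likelihood equation reduces to a quadratic, which is used here solely as an independent check on the iterations.

```python
import math

# Illustrative counts (an assumption, not data taken from the text).
# Cell probabilities: ((2+t)/4, (1-t)/4, (1-t)/4, t/4), with 0 < t < 1.
COUNTS = (125, 18, 20, 34)


def score(t, y=COUNTS):
    """First derivative of the log-likelihood with respect to t."""
    y1, y2, y3, y4 = y
    return y1 / (2 + t) - (y2 + y3) / (1 - t) + y4 / t


def score_derivative(t, y=COUNTS):
    """Second derivative of the log-likelihood (minus observed information)."""
    y1, y2, y3, y4 = y
    return -y1 / (2 + t) ** 2 - (y2 + y3) / (1 - t) ** 2 - y4 / t ** 2


def expected_information(t, y=COUNTS):
    """Fisher (expected) information for the whole sample."""
    n = sum(y)
    return (n / 4) * (1 / (2 + t) + 2 / (1 - t) + 1 / t)


def iterate(t0, method, tol=1e-10, max_iter=100):
    """Iterate from t0; return (final estimate, list of all iterates)."""
    t = t0
    fixed_slope = score_derivative(t0)  # used by the fixed-derivative method
    path = [t]
    for _ in range(max_iter):
        if method == "newton_raphson":
            step = -score(t) / score_derivative(t)
        elif method == "fixed_derivative":
            step = -score(t) / fixed_slope
        elif method == "scoring":
            step = score(t) / expected_information(t)
        else:
            raise ValueError(f"unknown method: {method}")
        t += step
        path.append(t)
        if abs(step) < tol:
            break
    return t, path


def closed_form_root(y=COUNTS):
    """Positive root of n*t^2 - (y1 - 2*y2 - 2*y3 - y4)*t - 2*y4 = 0."""
    y1, y2, y3, y4 = y
    n = sum(y)
    b = y1 - 2 * y2 - 2 * y3 - y4
    return (b + math.sqrt(b * b + 8 * n * y4)) / (2 * n)


def inefficient_start(y=COUNTS):
    """Unbiased but inefficient starting statistic (y1 - y2 - y3 + y4) / n."""
    y1, y2, y3, y4 = y
    return (y1 - y2 - y3 + y4) / sum(y)


if __name__ == "__main__":
    t0 = inefficient_start()
    for method in ("fixed_derivative", "newton_raphson", "scoring"):
        estimate, path = iterate(t0, method)
        # path[1] is the one-step estimate; path[-1] is the converged value.
        print(method, path[1], estimate, len(path) - 1)
    print("closed form", closed_form_root())
```

Comparing path[1] with the final value shows how far a single iteration from the inefficient starting statistic falls short of the M.L.E. for this one sample. The stopping rule (step size below tol) is a design choice made for the sketch; it says nothing about which root has been reached when the likelihood equation has more than one.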
The relative merits of the various methods in terms of their order of convergence have been extensively discussed by numerical analysts, together with the need to effect a compromise between the order of convergence and the practical effort required to apply the different methods (see e.g. Hamming, 1962). In any practical problem, however, we are not necessarily concerned with relative orders of convergence, utility of application or asymptotic probabilistic properties of the different methods. Given a single sample of observations x_1, x_2, ..., x_n, of fixed finite size n, from a distribution with parameter θ, we wish to evaluate the M.L.E. of θ for that sample. Regularity conditions and the associated existence of a unique consistent root are no guarantee that a single root of the likelihood equation will exist for this sample. In fact there will often exist multiple roots, corresponding to multiple relative maxima of the likelihood function, even if the regularity conditions are satisfied. The results described above do not consider this effect specifically, either because (as in the case of Kale, 1961) the author is not basically concerned with finite samples, or because (as in Norton, 1956) the particular examples discussed happen, quite fortuitously, to have a unique root for the likelihood equation. In general, then, we have a more fundamental problem of whether or not a particular