Estimation of Linear Functions of Cell Proportions

In this article certain contributions are made to the theory of estimating linear functions of cell proportions in connection with the methods of (1) least squares, (2) minimum chi-square, and (3) maximum likelihood. Distinctions among these three methods made by previous writers arise out of (1) confusion concerning theoretical vs. practical weights, (2) neglect of the effects of correlation between sampling errors, and (3) disagreement concerning methods of minimization. Throughout the paper the equivalence of these three methods from a practical point of view is emphasized in order to facilitate the integration and adaptation of existing statistical techniques. To this end:

1. The method of least squares as derived by Gauss in 1821-23 [6, pp. 224-228], in which weights in theory are chosen so as to minimize sampling variances, is herein called the ideal method of least squares, and the theoretical estimates are called ideal linear estimates. This approach avoids confusion between practical approximations and theoretically exact weights.

2. The ideal method of least squares is applied to uncorrelated linear functions of correlated sample frequencies to determine the appropriate quantity to minimize in order to derive ideal linear estimates in sample-frequency problems. This approach leads to a sum of squares of standardized uncorrelated linear functions of sampling errors in which statistics are to be substituted in the numerators.

3. A new elementary method is used to reduce the sum of squares in (2), before substitution of statistics, to Pearson's expression for chi-square. In this result, obtained without approximation, appropriate substitution of statistics shows that the denominators of chi-square should be treated as constant parameters in the differentiation process in order to minimize chi-square in conformity with the ideal method of least squares.

4. The ideal method of minimum chi-square, derived in (3) as the sample-frequency form of the ideal method of least squares, yields ideal linear estimates in terms of the unknown parameters in the denominators of chi-square. When these parameters are estimated by successive approximations in such a way as to be consistent with statistics based on them, it is shown that the method of minimum chi-square leads to maximum likelihood statistics.

5. An iterative method which converges to maximum likelihood estimates is developed for the case in which observations are cross-classified and first-order totals are known. In comparison with Deming's asymptotically efficient statistics, it is shown that, in a certain sense, maximum likelihood statistics are superior for any given value of n, especially in small samples.

6. The method of proportional distribution of marginal adjustments is de-
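The iterative adjustment of a cross-classified table to known first-order (marginal) totals, as in points (5) and (6), can be sketched as an iterative proportional fitting procedure. The routine below is a minimal illustration of the idea, not the paper's own derivation; the sample table and target margins are invented for the example.

```python
import numpy as np

def fit_margins(table, row_totals, col_totals, tol=1e-9, max_iter=1000):
    """Proportionally distribute marginal adjustments over a 2-way table
    of counts until its row and column sums match the known totals.
    (A sketch of iterative proportional fitting; the function name and
    example data are assumptions for illustration.)"""
    fitted = table.astype(float).copy()
    for _ in range(max_iter):
        # Scale each row so the row sums equal the known row totals.
        fitted *= (row_totals / fitted.sum(axis=1))[:, None]
        # Scale each column so the column sums equal the known column totals.
        fitted *= col_totals / fitted.sum(axis=0)
        # Column scaling disturbs the row sums; stop when the drift is small.
        if np.allclose(fitted.sum(axis=1), row_totals, atol=tol):
            break
    return fitted

# Hypothetical observed cell counts and known marginal totals.
observed = np.array([[30.0, 20.0],
                     [10.0, 40.0]])
rows = np.array([55.0, 45.0])   # known first-order row totals
cols = np.array([50.0, 50.0])   # known first-order column totals
adjusted = fit_margins(observed, rows, cols)
```

After convergence the adjusted cell proportions reproduce both sets of known totals while preserving, as nearly as possible, the structure of the observed table.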