The Estimation of the Lorenz Curve and Gini Index

Most measures of income inequality are derived from the Lorenz curve; indeed, Morgan (1962) states that the Gini index is the best single measure of inequality. The present article reviews some of the theoretical properties of the Lorenz curve, relates them to characteristics of the frequency function underlying the income distribution, and develops methods for obtaining accurate bounds on the Gini index which do not depend on curve fitting. In the process we should also like to lay to rest some myths concerning the Gini index, such as: (a) its relative insensitivity (Éltető and Frigyes, 1968), (b) difficulty in computation (1968), and (c) problems related to the inclusion of negative incomes (Budd, 1970).
The basic idea of our approach is to obtain upper and lower bounds on the Gini index from data which are grouped in intervals, where the mean income in each interval is known. The usual method (Morgan, 1962) of estimating the Gini index yields a lower bound because it assumes that all incomes in any interval equal the average income of that interval. We derive an upper bound on the grouping correction (Goldsmith et al., 1954, p. 10), and hence on the Gini index, by distributing the income so as to maximize the spread within each group. On the 1967 Internal Revenue Service tax data, the difference between our bounds is less than 0.006.
As most income distributions come from a frequency function (density) which decreases in the large income range, we develop improved bounds for the Gini index based on this assumption. Fortunately, this assumption can be checked from the data, so that we can use the sharper bounds only for the appropriate intervals. Using this second method, the difference between our bounds is about 0.002. Because Soltow (1965) detects a change in the Gini index of 0.8 of one per cent, or about 0.003 or 0.004, our bound seems quite adequate. In section VI we extend our method to obtain upper and lower curves for the Lorenz curve.
After reviewing the basic properties of the Lorenz curve, we proceed to derive bounds on the mean difference and the Gini index. In section IV we analyze an actual sample and show that the method used by the Census Bureau (1967) often leads to estimates which lie outside the mathematically possible bounds we derive. Finally, in an appendix we show that the Pareto law does not give a good fit to current United States tax data.
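The bounding idea described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a reproduction of the formulas derived later in the article: the function name `gini_bounds` and its arguments are hypothetical, the groups are assumed to be sorted by income with each group mean lying inside its interval, and the overall mean is assumed positive. The lower bound treats every income in an interval as equal to the interval mean; the upper bound adds a grouping correction obtained by placing each interval's incomes at its two endpoints, which is the within-interval arrangement with the largest spread for a given mean.

```python
import math


def gini_bounds(edges, counts, means):
    """Return (lower, upper) bounds on the Gini index from grouped data.

    edges  : k+1 interval endpoints in increasing order (last may be math.inf)
    counts : k numbers of income units, one per interval
    means  : k mean incomes, one per interval (each inside its interval)
    All names here are illustrative assumptions, not notation from the article.
    """
    n = sum(counts)
    total = sum(c * m for c, m in zip(counts, means))
    mu = total / n  # overall mean income, assumed positive

    # Lower bound: one minus twice the area under the Lorenz polygon
    # (trapezoid rule), i.e. all incomes in a group equal the group mean.
    cum_share = 0.0
    area = 0.0
    for c, m in zip(counts, means):
        p = c / n              # population share of the group
        share = c * m / total  # income share of the group
        area += p * (2.0 * cum_share + share)
        cum_share += share
    lower = 1.0 - area

    # Grouping correction: maximal within-group mean difference, reached
    # by a two-point distribution on the interval endpoints.
    correction = 0.0
    for lo, hi, c, m in zip(edges, edges[1:], counts, means):
        f = c / n
        if math.isinf(hi):
            spread = m - lo  # limit of the finite formula as hi grows
        elif hi > lo:
            spread = (m - lo) * (hi - m) / (hi - lo)
        else:
            spread = 0.0
        correction += f * f * spread
    upper = lower + correction / mu
    return lower, upper


if __name__ == "__main__":
    # Two equally sized groups on [0, 10) and [10, 20) with means 5 and 15.
    print(gini_bounds([0, 10, 20], [1, 1], [5, 15]))
```

For this toy input the lower bound is 0.25, the Gini index of two equally weighted incomes 5 and 15, and the upper bound is 0.375, the Gini index when the incomes sit at 0, 10, 10 and 20. The gap between the two numbers is the grouping correction divided by the overall mean, and it shrinks as the intervals become narrower or the group means approach an endpoint.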