The Conditional Distribution of Goodness-of-Fit Statistics for Discrete Data

Abstract

I consider the distribution of Pearson's statistic and of the likelihood-ratio goodness-of-fit statistic for discrete data in the important case where the data are extensive but sparse. It is argued that the appropriate reference distribution is conditional on the sufficient statistic for the unknown regression parameters, β. The first three conditional asymptotic cumulants are derived by Edgeworth expansion, and these are used for the computation of tail probabilities. The principal advantage of the limit considered here, as opposed to the more usual χ² limit, is that the cell counts need not be large.
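
For orientation, the two statistics named above can be written out in one common special case. The block below is a sketch only: it assumes independent Poisson counts, and the symbols y_i (observed counts), \hat{\mu}_i (fitted means under the regression model) and n (number of cells) are notation introduced here for illustration, not taken from the abstract, which covers discrete data more generally.

```latex
% Assumed setting: independent Poisson counts y_1, ..., y_n
% with fitted means \hat{\mu}_1, ..., \hat{\mu}_n.

% Pearson's statistic
X^2 = \sum_{i=1}^{n} \frac{(y_i - \hat{\mu}_i)^2}{\hat{\mu}_i}

% Likelihood-ratio (deviance) statistic, with the convention 0 \log 0 = 0
D = 2 \sum_{i=1}^{n} \left\{ y_i \log\frac{y_i}{\hat{\mu}_i} - (y_i - \hat{\mu}_i) \right\}
```

In the sparse case most y_i are 0 or 1 while n is large, so the individual terms are far from normal even though the sums run over many cells.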