A TWO-STEP SMOOTHING PROCEDURE FOR THE ANALYSIS OF SPARSE CONTINGENCY TABLES WITH ORDERED CATEGORIES

Assessing the multivariate structure of data is often the aim of the statistical analysis of economic, demographic and social phenomena. In the analysis of categorical data, the number of cells can be close to, or even greater than, the number of observations at hand, resulting in very small or even zero cell counts. Such a contingency table is usually referred to as a sparse table. In this situation the optimal properties of the usual statistical procedures may break down. Several authors have investigated the use of smoothing methods for sparse count data, but little has been done to evaluate whether these methods can help in discovering the multivariate structure of the data. This paper shows how the joint use of smoothing techniques and information measures may improve the analysis in a multivariate sparse context.
