Variable Selection and Generalized Chi-Square Analysis of Categorical Data Applied to a Large Cross-Sectional Occupational Health Survey

Medical practitioners in the cotton textile industry are particularly interested in the relationship between employee complaints of the respiratory ailment byssinosis and the variables sex, race, length of employment, smoking habit, and dustiness of work area. Recently, a cotton textile corporation surveyed the respiratory health of its employees. The analysis of byssinosis prevalence for the 5419 employees proceeds in two stages. In the first stage, variables are screened on the basis of contributed variability via Pearson chi-square statistics and Mantel-Haenszel statistics. The second stage involves model fitting, via the method developed by Grizzle, Starmer, and Koch for the analysis of categorical data, to the variables selected in the first stage, so that these variables' effects on byssinosis prevalence are explained with a minimum number of underlying parameters.
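The two screening statistics named for the first stage can be illustrated with a minimal sketch. It assumes the standard textbook forms of the Pearson chi-square statistic for a two-way table and of the Mantel-Haenszel statistic (one degree of freedom, no continuity correction) for a set of 2 x 2 strata. The function names and the counts are hypothetical and are not taken from the survey.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def mantel_haenszel(strata):
    """Mantel-Haenszel chi-square over a list of 2 x 2 tables (strata).

    Each stratum is ((a, b), (c, d)); the statistic compares the summed
    upper-left cell counts with their summed expectations.
    """
    diff = 0.0
    var = 0.0
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        r1, r2 = a + b, c + d
        c1, c2 = a + c, b + d
        diff += a - r1 * c1 / n
        # Hypergeometric variance of the upper-left cell given the margins.
        var += r1 * r2 * c1 * c2 / (n * n * (n - 1))
    return diff ** 2 / var


# Hypothetical counts (complaint yes/no by exposure high/low), one table
# per level of a stratifying variable. Illustrative only.
strata = [((10, 20), (20, 10)), ((8, 12), (5, 15))]
overall = [[18, 32], [25, 25]]  # the strata collapsed into one table

print("Pearson chi-square:", round(pearson_chi_square(overall), 3))
print("Mantel-Haenszel:", round(mantel_haenszel(strata), 3))
```

In a screening step of this kind, each candidate variable would be cross-classified against byssinosis status, with the Mantel-Haenszel form used when the association is to be assessed while adjusting for another variable.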