An insight on the use of multiple logistic regression analysis to estimate association between risk factor and disease occurrence.

Multiple logistic regression is an accepted statistical method for assessing the association between an antecedent characteristic (risk factor) and a quantal outcome (probability of disease occurrence), statistically adjusting for the potential confounding effects of other covariates. Yet the method has potential drawbacks which are not generally recognized. This article considers one important drawback of logistic regression. Specifically, the so-called main effect logistic model assumes that the probability of developing disease is linearly and additively related to the risk factors on the logistic scale. This assumption stipulates that, for each risk factor, the odds ratio is constant over all reference exposure levels, and that the odds ratio for exposure to two or more factors is equal to the product of the individual risk factor odds ratios. If the observed odds ratios in the data follow this pattern, the model-predicted odds ratios will be accurate, and the meaning of the odds ratio for each risk factor will be straightforward. But if the observed odds ratios deviate from the model assumption, the model will not fit the data accurately, and the model-predicted odds ratios will not reflect those in the data. Although satisfactory fit can always be achieved by adding polynomial and product terms derived from the original risk factors to the model, the odds ratios estimated by such an interaction logistic model are difficult to interpret, viz., the odds ratio for each risk factor depends not only on the reference exposure level of that factor, but also on the exposure levels of the other factors. (ABSTRACT TRUNCATED AT 250 WORDS)
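
The two properties implied by the main effect model, and the way a product term breaks them, can be illustrated with a small numerical sketch. This is only an illustration under stated assumptions: the two-factor setup, the function names, and every coefficient value below are hypothetical and are not taken from the article or its data.

```python
import math

def logit(x1, x2, b0, b1, b2, b12=0.0):
    """Log-odds of disease for exposure levels x1, x2.

    b12 = 0 gives the main effect model; b12 != 0 adds a product
    (interaction) term. All coefficients here are hypothetical.
    """
    return b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2

def odds_ratio(exposed, reference, **coef):
    """Odds ratio of `exposed` vs `reference` exposure levels.

    Odds equal exp(logit), so the odds ratio is exp(logit difference).
    """
    return math.exp(logit(*exposed, **coef) - logit(*reference, **coef))

# Hypothetical main effect model (no product term).
main = dict(b0=-2.0, b1=0.7, b2=0.4)

or_a = odds_ratio((1, 0), (0, 0), **main)        # factor A alone
or_b = odds_ratio((0, 1), (0, 0), **main)        # factor B alone
or_ab = odds_ratio((1, 1), (0, 0), **main)       # both factors
or_a_shifted = odds_ratio((3, 0), (2, 0), **main)  # A, other reference level

# Hypothetical interaction model: same terms plus a product term.
inter = dict(b0=-2.0, b1=0.7, b2=0.4, b12=0.5)

or_a_when_b0 = odds_ratio((1, 0), (0, 0), **inter)  # A's odds ratio at B = 0
or_a_when_b1 = odds_ratio((1, 1), (0, 1), **inter)  # A's odds ratio at B = 1

print(f"main effect: OR_A={or_a:.3f}, OR_B={or_b:.3f}, OR_AB={or_ab:.3f}")
print(f"main effect: OR_A at shifted reference={or_a_shifted:.3f}")
print(f"interaction: OR_A at B=0 is {or_a_when_b0:.3f}, at B=1 is {or_a_when_b1:.3f}")
```

Under the main effect coefficients, the joint odds ratio equals the product of the single-factor odds ratios, and the odds ratio for a one-unit increase in factor A is the same whatever the reference level. Once the product term is added, the odds ratio for factor A takes a different value at each level of factor B, which is the interpretive difficulty the abstract describes.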