Classification tree analysis: a statistical tool to investigate risk factor interactions with an example for colon cancer (United States)

Objective: Classification tree analysis is a potentially powerful tool for investigating multilevel interactions. In the context of colon cancer etiology, it may help identify disease pathways and evaluate important interactions among risk factors.

Methods: We apply classification tree analysis as a statistical method to investigate interactions among risk factors for colon cancer. We use data from a population-based case–control study of newly diagnosed colon cancer (N = 4403 cases and controls).

Results: As expected, many factors influence colon cancer risk, and they interact on many levels. The most important factor is the use of aspirin and/or non-steroidal anti-inflammatory drugs (NSAIDs): those taking these medications have lower risk. Family history appears as the second-level modifying factor when NSAIDs are not used, whereas Western diet is the second-level factor when NSAIDs are taken. The final tree has six levels, contains several modifying factors, and correctly classifies case or control status for 60.8% (95% CI 59.4–62.2) of all individuals.

Conclusions: Our results suggest that risk factors work together to determine disease risk. Accounting for interactions between risk factors makes it easier to dissect disease pathways and to identify the risk factors that increase susceptibility to disease. Our results highlight the importance of designing studies so that interactions can be addressed.
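The interval reported for the classification accuracy can be checked with a short calculation. The sketch below assumes a normal-approximation (Wald) interval for a binomial proportion, using the reported accuracy of 60.8% and N = 4403. The abstract does not say which interval method the authors used, so the function name `wald_ci` and the choice of method are illustrative assumptions rather than the authors' procedure.

```python
import math


def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Assumption: the abstract does not name the interval method; the Wald
    interval is used here only as a plausibility check.
    """
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)  # standard error of the proportion
    return p_hat - z * se, p_hat + z * se


# Values taken from the abstract: 60.8% correctly classified, N = 4403.
low, high = wald_ci(0.608, 4403)
print(f"95% CI: {100 * low:.1f} to {100 * high:.1f}")
```

Under these assumptions the bounds round to 59.4 and 62.2, which agrees with the interval stated in the abstract.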
