Examining the Joint Effect of Multiple Risk Factors Using Exposure Risk Profiles: Lung Cancer in Nonsmokers

Background: Profile regression is a Bayesian statistical approach designed for investigating the joint effect of multiple risk factors. It reduces dimensionality by using as its main unit of inference the exposure profiles of the subjects, that is, the sequence of covariate values that corresponds to each subject.

Objectives: We applied profile regression to a case–control study of lung cancer in nonsmokers, nested within the European Prospective Investigation into Cancer and Nutrition (EPIC) cohort, to estimate the combined effect of environmental carcinogens and to explore possible gene–environment interactions.

Methods: We tailored and extended the profile regression approach to the analysis of case–control studies, allowing for the analysis of ordinal data and the computation of posterior odds ratios. We compared our results with those obtained using standard logistic regression and classification tree methods, including multifactor dimensionality reduction.

Results: Profile regression strengthened previous observations from other study populations on the role of air pollutants, particularly particulate matter ≤ 10 μm in aerodynamic diameter (PM10), in lung cancer among nonsmokers. High-risk subject profiles were characterized by covariates including living on a main road, exposure to PM10 and nitrogen dioxide, and carrying out manual work. Such combinations of risk factors were consistent with a priori expectations. In contrast, other methods gave less interpretable results.

Conclusions: We conclude that profile regression is a powerful tool for identifying risk profiles that express the joint effect of etiologically relevant variables in multifactorial diseases.
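The abstract mentions computing posterior odds ratios but does not say how this is done. As a rough illustration only, the sketch below assumes that a sampler has already produced posterior draws of the case probability for a high-risk profile group and for a reference group, and it forms one odds ratio per draw before summarizing. The function names, the choice of reference group, and the numeric draws are all hypothetical and are not taken from the study.

```python
from statistics import median


def odds(p):
    """Convert a probability in (0, 1) to odds."""
    return p / (1.0 - p)


def posterior_odds_ratios(group_draws, reference_draws):
    """Return one odds ratio per posterior draw (draws are paired by index)."""
    return [odds(pg) / odds(pr) for pg, pr in zip(group_draws, reference_draws)]


def summarize(values, lower=0.025, upper=0.975):
    """Posterior median plus a crude nearest-rank credible interval."""
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[round(lower * (n - 1))]
    hi = ordered[round(upper * (n - 1))]
    return median(ordered), lo, hi


# Hypothetical posterior draws of P(case) for two profile groups.
# These numbers are made up for illustration; they are not study results.
high_risk_draws = [0.60, 0.55, 0.65, 0.70, 0.58]
reference_draws = [0.40, 0.45, 0.35, 0.50, 0.42]

ratios = posterior_odds_ratios(high_risk_draws, reference_draws)
med, lo, hi = summarize(ratios)
print(f"median OR = {med:.2f}, interval = ({lo:.2f}, {hi:.2f})")
```

With only a handful of draws the interval collapses to the smallest and largest values; a real analysis would use thousands of MCMC draws and a proper quantile estimator.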
