Identification and Prioritization of Relationships between Environmental Stressors and Adverse Human Health Impacts

Background: There are > 80,000 chemicals in commerce, with few data available describing their impacts on human health. Biomonitoring surveys, such as the NHANES (National Health and Nutrition Examination Survey), offer one route to identifying possible relationships between environmental chemicals and health impacts, but sparse data and the complexity of traditional models make these surveys difficult to leverage effectively.

Objective: We describe a workflow to efficiently and comprehensively evaluate and prioritize chemical–health impact relationships from the NHANES biomonitoring survey studies.

Methods: Using a frequent itemset mining (FIM) approach, we identified relationships between chemicals and health biomarkers and diseases.

Results: The FIM method identified 7,848 relationships between 219 chemicals and 93 health outcomes/biomarkers. Two case studies used to evaluate the FIM rankings demonstrate that the FIM approach is able to identify published relationships. Because the relationships are derived from the vast majority of the chemicals monitored by NHANES, the resulting list of associations is appropriate for evaluating results from targeted data mining or identifying novel candidate relationships for more detailed investigation.

Conclusions: Because of the computational efficiency of the FIM method, all chemicals and health effects can be considered in a single analysis. The resulting list provides a comprehensive summary of the chemical/health co-occurrences from NHANES that are higher than expected by chance. This information enables ranking and prioritization of chemicals or health effects of interest for evaluation of published results and design of future studies.

Citation: Bell SM, Edwards SW. 2015. Identification and prioritization of relationships between environmental stressors and adverse human health impacts. Environ Health Perspect 123:1193–1199; http://dx.doi.org/10.1289/ehp.1409138
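To make the idea of co-occurrences "higher than expected by chance" concrete, the following is a minimal Python sketch of pairwise frequent itemset mining scored by support and lift. It is an illustration under stated assumptions, not the authors' workflow: the function name `pair_lift`, the `min_support` threshold, and the item labels (such as `chem:lead_high`) are hypothetical, and the toy records are invented for the example and are not NHANES data.

```python
from collections import Counter
from itertools import combinations


def pair_lift(transactions, min_support=0.2):
    """Return {(a, b): (support, lift)} for item pairs meeting min_support.

    support = fraction of records containing both items
    lift    = support / (P(a) * P(b)); values above 1 mean the pair
              co-occurs more often than independence would predict.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for record in transactions:
        items = sorted(set(record))          # canonical order for pair keys
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    results = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:            # drop infrequent pairs
            continue
        expected = (item_counts[a] / n) * (item_counts[b] / n)
        results[(a, b)] = (support, support / expected)
    return results


# Hypothetical toy records: one set of discretized flags per participant.
participants = [
    {"chem:lead_high", "health:bp_high"},
    {"chem:lead_high", "health:bp_high"},
    {"chem:lead_high"},
    {"health:bp_high", "chem:cadmium_high"},
    {"chem:cadmium_high"},
    set(),
]

results = pair_lift(participants)
for pair, (support, lift) in sorted(results.items()):
    print(pair, round(support, 3), round(lift, 3))
```

In this toy data only the lead/blood-pressure pair clears the support threshold, and its lift exceeds 1. A real analysis would also need to handle how continuous measurements are discretized, survey weighting, and multiple-testing correction, none of which this sketch attempts.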
