Application of biclustering of gene expression data and gene set enrichment analysis methods to identify potentially disease causing nanomaterials

Background: The presence of diverse types of nanomaterials (NMs) in commerce is growing at an exponential pace. As a result, human exposure to these materials in the environment is inevitable, necessitating rapid and reliable toxicity testing methods to accurately assess the potential hazards associated with NMs. In this study, we applied biclustering and gene set enrichment analysis methods to derive essential features of the altered lung transcriptome following exposure to NMs that are associated with lung-specific diseases. Several datasets from public microarray repositories describing pulmonary diseases in mouse models following exposure to a variety of substances were examined, and functionally related biclusters of genes showing similar expression profiles were identified. The identified biclusters were then used to conduct a gene set enrichment analysis on pulmonary gene expression profiles derived from mice exposed to nano-titanium dioxide (nano-TiO2), carbon black (CB) or carbon nanotubes (CNTs) to determine the disease significance of these data-driven gene sets.

Results: Biclusters representing inflammation (chemokine activity), DNA binding, cell cycle, apoptosis, reactive oxygen species (ROS) and fibrosis processes were identified. All of the NM studies showed significant enrichment of the bicluster related to chemokine activity (DAVID; FDR p-value = 0.032). The bicluster related to pulmonary fibrosis was enriched in the studies investigating toxicity induced by CNTs and CB, suggesting the potential for these materials to induce lung fibrosis. The pro-fibrogenic potential of CNTs is well established. Although CB has not been shown to induce fibrosis, it induces stronger inflammatory, oxidative stress and DNA damage responses than nano-TiO2 particles.

Conclusion: The results of the analysis correctly identified all NMs to be inflammogenic and only CB and CNTs as potentially fibrogenic. In addition to identifying several previously defined, functionally relevant gene sets, the present study also identified two novel gene sets: a gene set associated with pulmonary fibrosis and a gene set associated with ROS, underlining the advantage of using a data-driven approach to identify novel, functionally related gene sets. The results can be used in future gene set enrichment analysis studies involving NMs or as features for clustering and classifying NMs of diverse properties.
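
The enrichment step described above can be illustrated with a minimal sketch. The abstract does not state which enrichment statistic was used, so the example assumes a simple one-sided hypergeometric over-representation test, which is only one of several possible choices. The function name `enrichment_p_value` and all gene identifiers are hypothetical placeholders, not data from the study.

```python
from math import comb


def enrichment_p_value(universe, gene_set, hits):
    """One-sided hypergeometric p-value, P(overlap >= observed).

    universe: all genes measured on the array
    gene_set: genes in one bicluster-derived gene set
    hits:     genes called differentially expressed after exposure
    """
    universe = set(universe)
    gene_set = set(gene_set) & universe  # ignore genes that were not measured
    hits = set(hits) & universe
    n, k, m = len(universe), len(gene_set), len(hits)
    observed = len(gene_set & hits)
    # Sum the hypergeometric probabilities for overlaps >= observed.
    total = comb(n, m)
    tail = sum(comb(k, i) * comb(n - k, m - i)
               for i in range(observed, min(k, m) + 1))
    return tail / total


# Hypothetical toy data: 20 measured genes, a 5-gene set, 6 responsive genes.
universe = [f"g{i}" for i in range(1, 21)]
bicluster = ["g1", "g2", "g3", "g4", "g5"]
responsive = ["g1", "g2", "g3", "g4", "g10", "g11"]

p = enrichment_p_value(universe, bicluster, responsive)
print(f"overlap p-value: {p:.4f}")
```

When several gene sets are tested against the same expression profile, the resulting p-values would normally be adjusted for multiple testing (for example with an FDR procedure) before any gene set is called enriched.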
