Statistical methods for gene set co-expression analysis

Motivation: The power of a microarray experiment derives from the identification of genes differentially regulated across biological conditions. To date, differential regulation is most often taken to mean differential expression, and a number of useful methods for identifying differentially expressed (DE) genes or gene sets are available. However, such methods are not able to identify many relevant classes of differentially regulated genes. One important example concerns differentially co-expressed (DC) genes.

Results: We propose an approach, gene set co-expression analysis (GSCA), to identify DC gene sets. The GSCA approach provides a false discovery rate controlled list of interesting gene sets, does not require that genes be highly correlated in at least one biological condition and is readily applied to data from individual or multiple experiments, as we demonstrate using data from studies of lung cancer and diabetes.

Availability: The GSCA approach is implemented in R and available at www.biostat.wisc.edu/~kendzior/GSCA/.

Contact: kendzior@biostat.wisc.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
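To make the idea of a differentially co-expressed gene set concrete, the following is a minimal sketch in Python, not the authors' R implementation. It assumes one plausible set-level score (the root-mean-square difference in pairwise Pearson correlations between two conditions) and assesses it by permuting sample labels. The function names, the score, and the toy data are all hypothetical; the abstract does not specify the statistic GSCA uses.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: treat a constant profile as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(expr):
    """expr holds one list of sample values per gene in the set."""
    pairs = combinations(range(len(expr)), 2)
    return [pearson(expr[i], expr[j]) for i, j in pairs]


def dc_score(expr_a, expr_b):
    """Hypothetical score: RMS difference in gene-pair correlations."""
    ra = pairwise_correlations(expr_a)
    rb = pairwise_correlations(expr_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ra, rb)) / len(ra))


def permutation_pvalue(expr_a, expr_b, n_perm=200, seed=0):
    """Compare the observed score with scores under shuffled sample labels."""
    rng = random.Random(seed)
    observed = dc_score(expr_a, expr_b)
    n_a = len(expr_a[0])
    pooled = [ga + gb for ga, gb in zip(expr_a, expr_b)]  # genes x all samples
    idx = list(range(len(pooled[0])))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        perm_a = [[g[k] for k in idx[:n_a]] for g in pooled]
        perm_b = [[g[k] for k in idx[n_a:]] for g in pooled]
        if dc_score(perm_a, perm_b) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (made up): three genes, six samples per condition.
cond_a = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.1, 3.9, 6.2, 8.0, 9.8, 12.1],
    [0.5, 1.1, 1.4, 2.1, 2.4, 3.0],
]
cond_b = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [5.0, 1.0, 6.0, 2.0, 4.0, 3.0],
    [3.0, 6.0, 1.0, 5.0, 2.0, 4.0],
]
score, p_value = permutation_pvalue(cond_a, cond_b)
print(f"score={score:.3f}, permutation p-value={p_value:.3f}")
```

In a full analysis, a score and p-value of this kind would be computed for every gene set of interest, and the resulting p-values would then need a multiple-testing adjustment to produce a false discovery rate controlled list, as the abstract describes.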
