CoGA: An R Package to Identify Differentially Co-Expressed Gene Sets by Analyzing the Graph Spectra

Gene set analysis aims to identify predefined sets of functionally related genes that are differentially expressed between two conditions. Although gene set analysis has been very successful, incorporating biological knowledge about the gene sets and enhancing statistical power over gene-by-gene analyses, it does not take into account the correlation (association) structure among the genes. In this work, we present CoGA (Co-expression Graph Analyzer), an R package for the identification of groups of differentially associated genes between two phenotypes. The analysis is based on concepts of information theory applied to the spectral distributions of the gene co-expression graphs, such as the spectral entropy, which measures the randomness of a graph structure, and the Jensen-Shannon divergence, which discriminates between classes of graphs. The package also includes common measures to compare gene co-expression networks in terms of their structural properties, such as centrality, degree distribution, shortest path length, and clustering coefficient. Besides the structural analyses, CoGA includes graphical interfaces for visual inspection of the networks, ranking of genes according to their “importance” in the network, and the standard differential expression analysis. We show by both simulation experiments and analyses of real data that the statistical tests performed by CoGA control the rate of false positives and are able to identify differentially co-expressed genes that other methods failed to detect.
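For reference, the two information-theoretic quantities named above can be written as follows. This is a sketch in standard notation, not taken from the abstract itself: it assumes that ρ denotes the spectral density of a co-expression graph, that is, the distribution of the eigenvalues of its adjacency matrix, and that ρ₁ and ρ₂ are the densities for the two phenotypes. The package may estimate or normalize these quantities differently.

```latex
% Spectral entropy of a graph with spectral density \rho
% (convention: 0 \log 0 = 0)
H(\rho) = -\int_{-\infty}^{+\infty} \rho(\lambda)\,\log \rho(\lambda)\, d\lambda

% Kullback-Leibler divergence between two spectral densities
KL(\rho_1 \,\|\, \rho_2) = \int_{-\infty}^{+\infty} \rho_1(\lambda)\,
    \log\frac{\rho_1(\lambda)}{\rho_2(\lambda)}\, d\lambda

% Jensen-Shannon divergence, with mixture \rho_M = \tfrac{1}{2}(\rho_1 + \rho_2)
JS(\rho_1, \rho_2) = \tfrac{1}{2}\,KL(\rho_1 \,\|\, \rho_M)
                   + \tfrac{1}{2}\,KL(\rho_2 \,\|\, \rho_M)
```

Under these definitions the Jensen-Shannon divergence is symmetric and equals zero only when the two spectral densities coincide, so it can serve as a measure of how different the co-expression structures of the two phenotypes are.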
