CluGene: A Bioinformatics Framework for the Identification of Co-Localized, Co-Expressed and Co-Regulated Genes Aimed at the Investigation of Transcriptional Regulatory Networks from High-Throughput Expression Data

A full understanding of the mechanisms underlying transcriptional regulatory networks requires unravelling complex causal relationships. High-throughput genomic technologies produce a huge amount of information pertaining to gene expression and regulation; however, the complexity of the available data is often overwhelming, and tools are needed to extract and organize the relevant information. This work starts from the assumption that the observation of co-occurrent events (in particular co-localization, co-expression and co-regulation) may provide a powerful starting point for unravelling transcriptional regulatory networks. Co-expression often implies shared functional pathways; co-expressed and functionally related genes are often co-localized as well; co-expressed and co-localized genes are also potential targets for co-regulation; finally, co-regulation seems more frequent for genes mapped to proximal chromosome regions. Despite the recognized importance of analysing co-occurrent events, no bioinformatics solution allowing the simultaneous analysis of co-expression, co-localization and co-regulation is currently available. Our work resulted in the development and evaluation of CluGene, a software tool for analysing multiple types of co-occurrence within a single interactive environment, allowing the combined investigation of co-expression, co-localization and co-regulation of genes. The use of CluGene will enhance the power of hypothesis testing and of experimental approaches aimed at unravelling transcriptional regulatory networks. The software is freely available at http://bioinfolab.unipg.it/.
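
To make the notion of combined co-occurrence concrete, the following is a minimal sketch, not CluGene's actual algorithm, of how gene pairs that are both co-expressed and co-localized might be flagged. It assumes Pearson correlation as the co-expression measure and a fixed base-pair window on the same chromosome as the co-localization criterion; the function names, thresholds (`min_corr`, `max_distance`) and toy data are all hypothetical. Co-regulation (for example, shared transcription factor binding sites) is not covered here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def co_occurring_pairs(genes, min_corr=0.9, max_distance=100_000):
    """Return gene pairs that are both co-expressed and co-localized.

    genes maps a gene name to (chromosome, start_position, expression_profile).
    min_corr and max_distance are illustrative thresholds, not values
    taken from the paper.
    """
    pairs = []
    # Sort by name so the output order is deterministic.
    for (name_a, rec_a), (name_b, rec_b) in combinations(sorted(genes.items()), 2):
        chr_a, pos_a, expr_a = rec_a
        chr_b, pos_b, expr_b = rec_b
        co_localized = chr_a == chr_b and abs(pos_a - pos_b) <= max_distance
        co_expressed = pearson(expr_a, expr_b) >= min_corr
        if co_localized and co_expressed:
            pairs.append((name_a, name_b))
    return pairs


# Toy data invented purely for illustration.
toy_genes = {
    "geneA": ("chr1", 10_000, [1.0, 2.0, 3.0, 4.0]),
    "geneB": ("chr1", 60_000, [2.1, 3.9, 6.2, 8.0]),
    "geneC": ("chr1", 5_000_000, [1.0, 2.0, 3.0, 4.0]),
    "geneD": ("chr2", 20_000, [4.0, 3.0, 2.0, 1.0]),
}

print(co_occurring_pairs(toy_genes))
```

With the toy data, only geneA and geneB satisfy both criteria: geneC is co-expressed with them but lies outside the distance window, and geneD sits on a different chromosome with an anti-correlated profile. A pairwise scan like this is quadratic in the number of genes, so a real tool would likely rely on clustering or positional indexing instead.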
