Extracting three-way gene interactions from microarray data

MOTIVATION: Extracting gene network information from high-throughput genomic data is an important and difficult task. A common approach is to cluster genes using pairwise correlation as a distance metric. However, pairwise correlation is too simplistic to describe the complex relationships among real genes, because co-expression relationships are often restricted to a specific set of biological conditions or processes. In this study, we describe a three-way gene interaction model that captures the dynamic nature of the co-expression relationship between a gene pair through the introduction of a controller gene.

RESULTS: We surveyed 0.4 billion possible three-way interactions among 1000 genes in a microarray dataset containing 678 human cancer samples. To test the reproducibility and statistical significance of our results, we randomly split the samples into a training set and a testing set. We found that the gene triplets with the strongest interactions (i.e. with the smallest P-values from appropriate statistical tests) in the training set also had the strongest interactions in the testing set. A distinctive pattern of three-way interaction emerged from these gene triplets: depending on whether the third gene is expressed, the remaining two genes can be either co-expressed or mutually exclusive (i.e. expression of either one of them would repress the other). Such three-way interactions can exist without apparent pairwise correlations. The identified three-way interactions may constitute candidates for further experimentation using techniques such as RNA interference, so that novel gene networks or pathways can be identified.
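To illustrate the kind of pattern described above, the following is a minimal sketch on simulated data, not the statistical test used in this study. It assumes that the controller gene is split into "expressed" and "not expressed" groups at a fixed threshold, and that the interaction is scored by comparing the two within-group Pearson correlations with a Fisher z-test. The function name `controlled_correlation`, the threshold of 0, and the simulated expression values are all hypothetical choices made for illustration.

```python
import math

import numpy as np


def controlled_correlation(x, y, z, threshold):
    """Correlate x and y separately where controller z is above/below threshold.

    Returns (r_on, r_off, p), where p is a two-sided P-value from a Fisher
    z-test for the difference between the two correlations. This test is an
    illustrative assumption, not the test used in the study.
    """
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    on = z > threshold
    off = ~on
    n_on, n_off = int(on.sum()), int(off.sum())
    if min(n_on, n_off) < 4:
        raise ValueError("each controller group needs at least 4 samples")

    r_on = float(np.corrcoef(x[on], y[on])[0, 1])
    r_off = float(np.corrcoef(x[off], y[off])[0, 1])

    # Fisher z-transform; clamp r to keep atanh finite.
    f_on = math.atanh(max(-0.999999, min(0.999999, r_on)))
    f_off = math.atanh(max(-0.999999, min(0.999999, r_off)))
    se = math.sqrt(1.0 / (n_on - 3) + 1.0 / (n_off - 3))
    p = math.erfc(abs(f_on - f_off) / se / math.sqrt(2.0))
    return r_on, r_off, p


# Simulated triplet (hypothetical data): x and y are co-expressed when the
# controller z is "on" and anti-correlated when it is "off".
rng = np.random.default_rng(0)
n = 678  # same sample count as the dataset, used here only for scale
z = rng.normal(size=n)
x = rng.normal(size=n)
y = np.where(z > 0.0, x, -x) + 0.5 * rng.normal(size=n)

# The pairwise correlation over all samples hides the relationship.
r_all = float(np.corrcoef(x, y)[0, 1])

# Random split into training and testing halves to check reproducibility.
order = rng.permutation(n)
train, test = order[: n // 2], order[n // 2:]
r_on_train, r_off_train, p_train = controlled_correlation(
    x[train], y[train], z[train], 0.0
)
r_on_test, r_off_test, p_test = controlled_correlation(
    x[test], y[test], z[test], 0.0
)

print(f"all samples: r = {r_all:.2f}")
print(f"train: r_on = {r_on_train:.2f}, r_off = {r_off_train:.2f}, P = {p_train:.3g}")
print(f"test:  r_on = {r_on_test:.2f}, r_off = {r_off_test:.2f}, P = {p_test:.3g}")
```

In this toy setting the overall correlation between the two genes is close to zero, while the within-group correlations have opposite signs in both halves of the data. A full survey would apply such a score to every gene triplet and compare the top-ranked triplets between the training and testing sets.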
