Discretization Provides a Conceptually Simple Tool to Build Expression Networks

Biomarker identification using network methods depends on finding regular co-expression patterns; the overall connectivity matters more than any single relationship. A second requirement is a simple algorithm for ranking patients by how relevant a gene set is to them. Discretized data helps with both requirements, first to identify gene cliques and then to stratify patients. We explore a biologically intuitive discretization technique that codes genes as up- or down-regulated, with values close to the mean coded as unchanged; this allows a richer description of relationships between genes than positive and negative correlation can provide. Positive co-regulation is split into up-together and down-together relationships, and negative co-regulation is treated as directed up-down relationships. With real data, but not with the synthetic data, some of these directed relationships exist in only one direction. Our results agree closely with the template gene interactions that SynTReN uses to build synthetic microarray-like data: the known relationships underlying the synthetic data are successfully identified by our method. We illustrate our approach using two studies on white blood cells and derived immortalized cell lines, with an analysis of gene expression for energy metabolism pathways, and compare it with standard correlation-based computations. No attempt is made to distinguish possible causal links, as the search for biomarkers would be crippled by losing highly significant co-expression relationships. This contrasts with approaches like ARACNE and IRIS.
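The three-state coding described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `discretize` and `pair_counts`, the per-gene z-score, and the cutoff of 0.5 standard deviations for "close to the mean" are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def discretize(expr, threshold=0.5):
    """Code each gene (row) per sample (column) as +1 up, -1 down, 0 unchanged.

    Assumption: 'close to the mean' means within `threshold` standard
    deviations of that gene's mean; the actual cutoff rule is not given
    in the text.
    """
    expr = np.asarray(expr, dtype=float)
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # constant genes end up coded as all 'unchanged'
    z = (expr - mean) / std
    codes = np.zeros(expr.shape, dtype=int)
    codes[z > threshold] = 1
    codes[z < -threshold] = -1
    return codes


def pair_counts(codes, i, j):
    """Count samples supporting each kind of relationship between genes i and j."""
    a, b = codes[i], codes[j]
    return {
        "up_up": int(np.sum((a == 1) & (b == 1))),        # up-together
        "down_down": int(np.sum((a == -1) & (b == -1))),  # down-together
        "up_down": int(np.sum((a == 1) & (b == -1))),     # i up while j down
        "down_up": int(np.sum((a == -1) & (b == 1))),     # i down while j up
    }


# Toy data: genes 0 and 2 move together, gene 1 moves against them.
expr = [[1, 2, 3, 4, 5],
        [5, 4, 3, 2, 1],
        [1, 2, 3, 4, 5]]
codes = discretize(expr)
same = pair_counts(codes, 0, 2)
opposite = pair_counts(codes, 0, 1)
```

Keeping `up_down` and `down_up` as separate counts is what lets a relationship be present in one direction only, which a single correlation coefficient cannot express.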
For each discovered relationship, we can identify the samples on which it is based in the discretized sample-gene matrix, along with a simplified view of the patterns of gene expression. This helps to dissect the gene-sample combinations relevant to a research topic, identifying sets of co-regulated and anti-regulated genes and the samples or patients in which each relationship occurs.
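Recovering the samples behind a relationship amounts to a lookup in the discretized matrix. The sketch below assumes a genes-by-samples matrix already coded as +1 (up), -1 (down) and 0 (unchanged); the helper name `supporting_samples`, the matrix orientation, and the sample labels are hypothetical.

```python
import numpy as np


def supporting_samples(codes, gene_a, gene_b, state_a, state_b, sample_ids):
    """Return the samples in which gene_a is in state_a and gene_b is in state_b.

    `codes` is a genes-by-samples matrix of -1/0/+1 values (an assumed layout).
    """
    codes = np.asarray(codes)
    mask = (codes[gene_a] == state_a) & (codes[gene_b] == state_b)
    return [s for s, keep in zip(sample_ids, mask) if keep]


# Toy discretized matrix: two genes, five patients.
codes = np.array([[1, 1, 0, -1, -1],
                  [-1, -1, 0, 1, 0]])
samples = ["P1", "P2", "P3", "P4", "P5"]

# Patients in which gene 0 is up while gene 1 is down, and the reverse.
up_down = supporting_samples(codes, 0, 1, 1, -1, samples)
down_up = supporting_samples(codes, 0, 1, -1, 1, samples)
```

The returned patient lists can then be used directly to stratify samples by whether they take part in a given relationship.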
