Discovering discriminative graph patterns from gene expression data

We consider the problem of mining gene expression data to single out interesting features that characterize the healthy and unhealthy samples of an input dataset. We present an approach based on a network model of the input gene expression data, in which each sample is represented by a labelled graph. To the best of our knowledge, this is the first attempt to build a separate graph for each sample and thus to represent a sample set as a database of graphs. Our main goal is to single out interesting differences between healthy and unhealthy samples by extracting "discriminative patterns" among the graphs belonging to the two sample sets. Unlike other approaches presented in the literature, our technique is able to take into account important local similarities as well as collaborative effects involving interactions between multiple genes. In particular, we use edge-labelled graphs and measure the discriminative power of a pattern based on the edge weights, which represent how relevant the co-expression between two genes is.
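To make the data model concrete, the following is a minimal Python sketch of the idea of one edge-labelled graph per sample and a pattern score that compares the two sample sets. It is an illustration under stated assumptions, not the method of the paper: the abstract does not specify how edge weights or discriminative power are computed, so the weight `1 - |a - b|`, the scoring rule (absolute difference of mean pattern weights), and the names `sample_graph`, `pattern_weight` and `discriminative_score` are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def sample_graph(expr):
    """Build one edge-labelled graph for a single sample.

    expr maps a gene name to an expression level scaled to [0, 1].
    The weight 1 - |a - b| is a hypothetical stand-in for how relevant
    the co-expression of two genes is; the source does not define it.
    """
    return {
        frozenset((g, h)): 1.0 - abs(expr[g] - expr[h])
        for g, h in combinations(sorted(expr), 2)
    }


def pattern_weight(graph, pattern):
    """Mean weight of the pattern's edges in one sample graph.

    Edges that are absent from the graph count as weight 0.
    """
    return mean(graph.get(frozenset(edge), 0.0) for edge in pattern)


def discriminative_score(pattern, healthy, unhealthy):
    """Assumed score: gap between the mean pattern weights of the two sets."""
    h = mean(pattern_weight(g, pattern) for g in healthy)
    u = mean(pattern_weight(g, pattern) for g in unhealthy)
    return abs(h - u)


# Toy data (invented for illustration): genes A and B are co-expressed
# in the healthy samples but not in the unhealthy ones.
healthy = [
    sample_graph({"A": 0.9, "B": 0.8, "C": 0.1}),
    sample_graph({"A": 0.8, "B": 0.9, "C": 0.2}),
]
unhealthy = [
    sample_graph({"A": 0.9, "B": 0.1, "C": 0.2}),
    sample_graph({"A": 0.8, "B": 0.2, "C": 0.1}),
]

score_ab = discriminative_score([("A", "B")], healthy, unhealthy)
score_ac = discriminative_score([("A", "C")], healthy, unhealthy)
```

In this toy setting the single-edge pattern A-B scores higher than A-C, because the A-B weight differs between the two sample sets while the A-C weight does not. A pattern with several edges is scored the same way, which is one simple way to reflect interactions among multiple genes.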
