Cluster Switches in Gene Expression Data

Following the sequencing of the human genome, the next step is to understand the function of all genes in health and disease. Experimental study of the functions of all genes in all diseases is impossible and also unnecessary, since not all genes are functional in all conditions. Even so, determining which genes are functional in each condition, and how they are regulated, requires laborious and expensive experimental effort. In this paper we suggest a heuristic framework, CSGI (cluster switching genes identification), for identifying promising genes for thorough analysis. In CSGI we take a cluster defined in one context and project it onto another context, identifying genes that behave differently in the two contexts. We provide a case study of immune system clusters showing that our approach identifies clusters representing core conserved biological processes, as well as important genes that switch clusters.
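The projection idea can be illustrated with a minimal sketch. It is not the CSGI procedure itself, whose details are not given in this abstract. It assumes that cluster labels were already obtained in a first context (A), that expression profiles of the same genes are available in a second context (B), and that a gene "switches" when its context-B profile correlates best with the centroid of a cluster other than its own. The function names, the correlation criterion, and the toy data are all hypothetical.

```python
import numpy as np

def zscore_rows(x):
    """Standardize each row to mean 0 and (population) std 1."""
    x = np.asarray(x, dtype=float)
    mu = x.mean(axis=1, keepdims=True)
    sd = x.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0  # leave constant rows at zero instead of dividing by 0
    return (x - mu) / sd

def find_switching_genes(expr_b, labels_a):
    """Flag genes whose context-B profile fits another context-A cluster better.

    expr_b   : genes x samples expression matrix measured in context B
    labels_a : one cluster label per gene, assigned in context A
    Returns (indices of flagged genes, best-fitting cluster per gene).
    """
    z = zscore_rows(expr_b)
    labels_a = np.asarray(labels_a)
    clusters = np.unique(labels_a)
    # Project each context-A cluster onto context B via its mean profile there.
    centroids = np.vstack([z[labels_a == c].mean(axis=0) for c in clusters])
    # Pearson correlation = mean product of row-standardized profiles.
    corr = z @ zscore_rows(centroids).T / z.shape[1]
    best = clusters[np.argmax(corr, axis=1)]
    return np.flatnonzero(best != labels_a), best

# Toy data (made up): two expression patterns across six context-B samples.
p1 = np.array([0, 1, 2, 3, 4, 5], dtype=float)
p2 = np.array([5, 3, 4, 1, 2, 0], dtype=float)
expr_b = np.vstack([p1, 2 * p1 + 1, p2,       # genes 0-2: cluster 0 in context A
                    p2, 3 * p2, p2 + 10])     # genes 3-5: cluster 1 in context A
labels_a = [0, 0, 0, 1, 1, 1]

switching, best = find_switching_genes(expr_b, labels_a)
print(switching, best)
```

In this toy example gene 2 belongs to cluster 0 in context A but follows the cluster-1 pattern in context B, so it is the only gene flagged. A real analysis would need a significance threshold and a centroid that excludes the gene being tested; both are omitted here for brevity.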
