ClustEx2: Gene Module Identification using Density-Based Network Hierarchical Clustering

With the rapid accumulation of large-scale omic data in cancer, it is now easy to obtain a list of seed genes associated with a clinical phenotype change or an anticancer drug perturbation. However, interpreting these genes functionally remains challenging. Many studies indicate that genes work cooperatively as functional modules in complex cellular processes, so more biological insight can be gained by clustering the seed genes together with their closely interacting neighbors in molecular networks into gene modules. Unlike traditional network community detection, gene module identification involves both the seed genes, which carry context-specific information, and their closely connected neighbors in the static molecular network. We developed a new method, ClustEx2, which identifies gene modules from a set of user-defined seed genes in a given gene network. The method formulates module identification in a unified framework based on density-based hierarchical clustering. ClustEx2 can incorporate both network topology and context-specific information about the seed genes and their interactions, such as differential expression and co-expression. Its performance was systematically investigated on a known biological process, tumor necrosis factor-induced inflammation, and it also helped suggest potential biological functions when used to analyze anticancer drug response-associated modules in TCGA data.
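As a rough illustration of the general idea of seed-based module identification, the sketch below groups seed genes that lie close together in a network and then adds the intermediate genes on the shortest paths between them. It is not the ClustEx2 algorithm: it uses plain single-linkage grouping by hop distance as a simplified stand-in for the density-based hierarchical clustering, and it ignores expression data. The function names, the `max_hops` parameter, and the toy network are all assumptions made for this example.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def seed_modules(adj, seeds, max_hops=2):
    """Group seeds within `max_hops` of each other (single linkage), then
    extend each group with the genes on shortest paths between linked seeds.

    `adj` is an undirected network given as {gene: set of neighbor genes}.
    Hypothetical helper for illustration, not the published method.
    """
    seeds = sorted(seeds)
    dist = {s: bfs_distances(adj, s) for s in seeds}
    parent = {s: s for s in seeds}  # union-find forest over the seeds

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every pair of seeds that is at most `max_hops` apart.
    linked = []
    for i, a in enumerate(seeds):
        for b in seeds[i + 1:]:
            d = dist[a].get(b)
            if d is not None and d <= max_hops:
                parent[find(a)] = find(b)
                linked.append((a, b, d))

    modules = {}
    for s in seeds:
        modules.setdefault(find(s), set()).add(s)

    # A node n lies on a shortest a-b path iff d(a, n) + d(n, b) == d(a, b).
    for a, b, d in linked:
        root = find(a)
        for n in adj:
            if n in dist[a] and n in dist[b] and dist[a][n] + dist[b][n] == d:
                modules[root].add(n)

    return sorted(sorted(m) for m in modules.values())


# Toy network: a chain A-B-C-D-E-F plus a separate pair X-Y.
toy_adj = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"},
    "D": {"C", "E"}, "E": {"D", "F"}, "F": {"E"},
    "X": {"Y"}, "Y": {"X"},
}
toy_seeds = {"A", "C", "F", "X"}
print(seed_modules(toy_adj, toy_seeds, max_hops=2))
```

With `max_hops=2`, seeds A and C are merged and the non-seed gene B between them joins their module, while F and X remain singletons. Raising `max_hops` merges more seeds, which loosely mirrors cutting a clustering hierarchy at a coarser level.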
