Multiscale Embedded Gene Co-expression Network Analysis

Gene co-expression network analysis has been shown to be effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques for constructing co-expression networks either require critical prior information, such as a predefined number of clusters or numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems, such as a scale-free degree distribution or small-worldness. Previously, a graph filtering technique called the Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets, such as financial stock prices and gene expression, to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data because of several drawbacks: its high computational complexity of O(|V|^3), the presence of false positives caused by the maximal planarity constraint, and the inadequacy of its clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data sets and to gene expression data from breast carcinoma and lung adenocarcinoma in The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.
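
To make the planarity constraint concrete, the following is a minimal sketch of the greedy PMFG filtering idea, not the MEGENA implementation. It assumes the third-party networkx library is available and uses its `check_planarity` function; the helper name `pmfg`, the toy gene labels, and the random similarities are hypothetical and serve only as illustration. Candidate edges are visited in order of decreasing similarity, and an edge is kept only if the graph stays planar. A maximal planar graph on |V| >= 3 nodes has exactly 3(|V| - 2) edges, so the loop can stop once that count is reached.

```python
import itertools
import random

import networkx as nx  # assumption: networkx is installed


def pmfg(similarity, nodes):
    """Greedy PMFG sketch: keep the strongest edges that preserve planarity.

    `similarity` maps node pairs (u, v) to a score; assumes >= 3 nodes.
    """
    graph = nx.Graph()
    graph.add_nodes_from(nodes)
    max_edges = 3 * (len(nodes) - 2)  # edge count of a maximal planar graph
    # Visit candidate edges from most to least similar.
    ranked = sorted(similarity, key=similarity.get, reverse=True)
    for u, v in ranked:
        graph.add_edge(u, v, weight=similarity[(u, v)])
        is_planar, _ = nx.check_planarity(graph)
        if not is_planar:
            graph.remove_edge(u, v)  # this edge would break planarity
        if graph.number_of_edges() == max_edges:
            break  # no further edge can be embedded
    return graph


# Toy data: six hypothetical genes with random pairwise similarities.
rng = random.Random(0)
genes = [f"g{i}" for i in range(6)]
sim = {pair: rng.random() for pair in itertools.combinations(genes, 2)}
network = pmfg(sim, genes)
```

Because every candidate edge triggers a full planarity test, this serial loop becomes expensive as the number of genes grows, which is the cost that motivates parallelizing the embedded network construction.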
