Integrating Microarray Data and GRNs.

With the completion of the Human Genome Project and the emergence of high-throughput technologies, a vast amount of molecular and biological data is being produced. Two of the most important data sources are microarray gene-expression experiments and their respective databanks (e.g., Gene Expression Omnibus, GEO (http://www.ncbi.nlm.nih.gov/geo)), and molecular pathways and Gene Regulatory Networks (GRNs) stored and curated in public repositories (e.g., Kyoto Encyclopedia of Genes and Genomes, KEGG (http://www.genome.jp/kegg/pathway.html), and Reactome (http://www.reactome.org/ReactomeGWT/entrypoint.html)) as well as in commercial ones (e.g., Ingenuity IPA (http://www.ingenuity.com/products/ipa)). Combining these two sources aims to give new insight into disease and to reveal new molecular targets for the treatment of specific phenotypes. Three major research lines that try to combine data from both sources can be identified: (1) de novo reconstruction of GRNs, (2) identification of gene signatures, and (3) identification of differentially expressed GRN functional paths (i.e., sub-GRN paths that distinguish between different phenotypes). In this chapter, we give an overview of existing methods that support the different types of gene-expression and GRN integration, with a focus on methodologies that aim to identify phenotype-discriminant GRNs or subnetworks, and we also present our own methodology.
