Biomarker Identification by Knowledge-Driven Multi-Level ICA and Motif Analysis

Statistical methods often fail to identify biologically meaningful biomarkers for a disease under study from expression data alone. In this paper, we develop a novel strategy, knowledge-driven multi-level independent component analysis (ICA), to infer regulatory signals and identify biologically relevant biomarkers from microarray data. Specifically, based on multi-level clustering results and partial prior knowledge, we apply ICA to find stable disease-specific linear regulatory modes and then extract the associated biomarker genes. A statistical test is designed to evaluate the significance of transcription factor enrichment for the extracted gene set based on motif information. Experimental results on an Rsf-1-induced microarray data set show that our knowledge-driven method extracts more biologically meaningful biomarkers, with significant enrichment of transcription factors related to ovarian cancer, than other gene selection methods with or without prior knowledge.
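To illustrate the kind of enrichment test described above, the following is a minimal sketch that assumes a one-sided hypergeometric test, a common choice for gene-set enrichment. The abstract does not specify the form of the test designed in this paper, so this is a stand-in rather than the authors' statistic. The function name and all arguments are hypothetical: `population` is the number of genes on the array, `annotated` the number whose promoters carry a motif for a given transcription factor, `selected` the size of the extracted gene set, and `overlap` the number of extracted genes carrying the motif.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    Assumption: the extracted gene set is treated as a draw of `selected`
    genes without replacement from `population` genes, `annotated` of which
    carry the motif. This is an illustrative stand-in, not the paper's test.
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers chosen only for illustration (not taken from the paper).
p_value = hypergeom_enrichment_pvalue(
    population=20, annotated=5, selected=4, overlap=2
)
print(f"enrichment p-value: {p_value:.4f}")
```

A small p-value would suggest that the motif occurs in the extracted gene set more often than expected by chance. When many transcription factors are tested, a multiple-testing correction would normally be applied as well.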
