Pathway recognition and augmentation by computational analysis of microarray expression data

MOTIVATION: We present a system, QPACA (Quantitative Pathway Analysis in Cancer), for analysis of biological data in the context of pathways. QPACA supports data visualization and both fine- and coarse-grained specifications but, more importantly, addresses the problems of pathway recognition and pathway augmentation.

RESULTS: Given a set of genes hypothesized to be part of a pathway or a coordinated process, QPACA is able to reliably distinguish true pathways from non-pathways using microarray expression data. Relying on the observation that only some of the experiments within a dataset are relevant to a specific biochemical pathway, QPACA automates selection of this subset using an optimization procedure. We present data on all human and yeast pathways found in the KEGG pathway database. In 117 out of 191 cases (61%), QPACA correctly identified these positive cases as bona fide pathways, with p-values measured using rigorous permutation analysis. Success in recognizing pathways depended on pathway size, with the largest quartile of pathways yielding 83% success. In cross-validation tests of pathway membership prediction, QPACA yielded enrichments for predicted pathway genes over random genes of 2-fold or better the majority of the time, and of 10-fold or better 10-20% of the time.

AVAILABILITY: The software is available for academic research use free of charge by email request.

SUPPLEMENTARY INFORMATION: Data used in the paper may be downloaded from http://www.jainlab.org/downloads.html
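
To make the idea of a permutation-based p-value for pathway recognition concrete, the following is a minimal sketch and not QPACA's actual procedure. The abstract does not specify the scoring function, so the score used here (mean pairwise Pearson correlation of expression profiles) is an assumption, as are all function names and the toy data. The sketch also omits the experiment-subset selection step and scores genes across all experiments.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return sxy / sqrt(sxx * syy)


def coherence(expr, genes):
    """Hypothetical pathway score: mean pairwise correlation of the genes."""
    pairs = list(combinations(genes, 2))
    return sum(pearson(expr[g], expr[h]) for g, h in pairs) / len(pairs)


def permutation_p_value(expr, gene_set, n_perm=1000, seed=0):
    """Estimate how often a random same-size gene set scores as high."""
    rng = random.Random(seed)
    observed = coherence(expr, gene_set)
    all_genes = sorted(expr)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_genes, len(gene_set))
        if coherence(expr, sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


def make_toy_data(seed=1, n_experiments=20, n_noise=40):
    """Synthetic data: 5 co-regulated genes plus independent noise genes."""
    rng = random.Random(seed)
    signal = [rng.gauss(0, 1) for _ in range(n_experiments)]
    expr = {}
    for i in range(5):
        expr[f"path{i}"] = [s + rng.gauss(0, 0.3) for s in signal]
    for i in range(n_noise):
        expr[f"noise{i}"] = [rng.gauss(0, 1) for _ in range(n_experiments)]
    return expr


if __name__ == "__main__":
    expr = make_toy_data()
    pathway = [f"path{i}" for i in range(5)]
    print(permutation_p_value(expr, pathway, n_perm=200))
```

Drawing the null distribution from random gene sets of the same size is one common choice. It controls for set size, which matters because the abstract reports that recognition success depends on pathway size.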
