Along signal paths: an empirical gene set approach exploiting pathway topology

Gene set analysis using biological pathways has become a widely used statistical approach for gene expression analysis. A biological pathway can be represented as a graph in which genes are the nodes and their interactions are the edges. From a biological point of view, only some portions of a pathway are expected to be altered; however, few methods that use pathway topology have been proposed, and none of them tries to identify the signal paths within a pathway that are most involved in the biological problem. Here, we present clipper, a novel algorithm for pathway analysis that aims to fill this gap. clipper implements a two-step empirical approach that exploits the decomposition of a graph into a junction tree to reconstruct the most relevant signal path. In the first step, clipper selects significant pathways according to statistical tests on the means and the concentration matrices of the graphs derived from pathway topologies. In the second step, it identifies within these pathways the signal paths having the greatest association with a specific phenotype. We test our approach on simulated data and on two real expression datasets. Our results demonstrate the efficacy of clipper in identifying signal transduction paths that are coherent with the biological problem.
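
The junction tree decomposition mentioned above can be illustrated on a toy pathway. The following is a minimal sketch, not clipper's implementation: the abstract does not say how the junction tree is built, so the sketch assumes the standard construction (moralize the directed pathway graph, triangulate it by node elimination, then join the maximal cliques with a maximum-weight spanning tree). The function names `moralize`, `elimination_cliques` and `junction_tree`, and the five-gene pathway `toy_dag`, are hypothetical and introduced only for illustration.

```python
from itertools import combinations


def moralize(dag):
    """Turn a DAG (node -> set of children) into an undirected moral graph."""
    nodes = set(dag) | {c for children in dag.values() for c in children}
    adj = {v: set() for v in nodes}
    parents = {v: set() for v in nodes}
    for u, children in dag.items():
        for c in children:
            adj[u].add(c)
            adj[c].add(u)
            parents[c].add(u)
    # "Marry" every pair of parents that share a child.
    for ps in parents.values():
        for a, b in combinations(sorted(ps), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def elimination_cliques(adj):
    """Triangulate by min-degree elimination and return the maximal cliques."""
    work = {v: set(ns) for v, ns in adj.items()}
    cliques = []
    while work:
        # Assumed heuristic: eliminate the lowest-degree node, ties by name.
        v = min(work, key=lambda x: (len(work[x]), x))
        nbrs = work.pop(v)
        for a, b in combinations(nbrs, 2):  # fill-in edges
            work[a].add(b)
            work[b].add(a)
        for n in nbrs:
            work[n].discard(v)
        clique = frozenset(nbrs | {v})
        if not any(clique <= c for c in cliques):
            cliques.append(clique)
    return cliques


def junction_tree(cliques):
    """Connect cliques by a maximum-weight spanning tree (Kruskal);
    the weight of an edge is the size of the shared separator."""
    edges = sorted(
        ((len(a & b), i, j)
         for (i, a), (j, b) in combinations(enumerate(cliques), 2) if a & b),
        key=lambda e: (-e[0], e[1], e[2]),
    )
    root = list(range(len(cliques)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            root[ri] = rj
            tree.append((i, j))
    return tree


# Hypothetical toy pathway: A and B both regulate C, then C -> D -> E.
toy_dag = {"A": {"C"}, "B": {"C"}, "C": {"D"}, "D": {"E"}}
cliques = elimination_cliques(moralize(toy_dag))
tree = junction_tree(cliques)
for i, j in tree:
    print(sorted(cliques[i]), "--", sorted(cliques[j]))
```

In this toy case the tree is a chain of three cliques, {D, E}, {C, D} and {A, B, C}. A run of adjacent cliques in such a tree is one plausible way to read the "signal path" of the abstract, although the abstract itself does not define the term at this level of detail.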

[1]  Robert Castelo,et al.  A Robust Procedure For Gaussian Graphical Model Search From Microarray Data With p Larger Than n , 2006, J. Mach. Learn. Res..

[2]  Frank Emmert-Streib,et al.  The Chronic Fatigue Syndrome: A Comparative Pathway Analysis , 2007, J. Comput. Biol..

[3]  R. Tibshirani,et al.  Significance analysis of microarrays applied to the ionizing radiation response , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[4]  Lei Wang,et al.  Network-enabled gene expression analysis , 2012, BMC Bioinformatics.

[5]  K. Strimmer,et al.  Statistical Applications in Genetics and Molecular Biology A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics , 2011 .

[6]  Monica Chiogna,et al.  Gene set analysis exploiting the topology of a pathway , 2010, BMC Systems Biology.

[7]  P. Khatri,et al.  A systems biology approach for pathway level analysis. , 2007, Genome research.

[8]  Isabelle Richard,et al.  Mutations in the proteolytic enzyme calpain 3 cause limb-girdle muscular dystrophy type 2A , 1995, Cell.

[9]  Korbinian Strimmer,et al.  Statistical Applications in Genetics and Molecular Biology , 2005 .

[10]  S. Dudoit,et al.  Gains in Power from Structured Two-Sample Tests of Means on Graphs , 2010, 1009.5173.

[11]  Gordon K Smyth,et al.  Statistical Applications in Genetics and Molecular Biology Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments , 2011 .

[12]  P. Park,et al.  Discovering statistically significant pathways in expression profiling studies. , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[13]  E. Kudryashova,et al.  Calpain 3 participates in sarcomere remodeling by acting upstream of the ubiquitin-proteasome pathway. , 2007, Human molecular genetics.

[14]  Pooja Mittal,et al.  A novel signaling pathway impact analysis , 2009, Bioinform..

[15]  R. Tibshirani,et al.  Sparse inverse covariance estimation with the graphical lasso. , 2008, Biostatistics.

[16]  Cengizhan Ozturk,et al.  Pathway analysis of high-throughput biological data within a Bayesian network framework , 2011, Bioinform..

[17]  Timothy W Gant,et al.  MicroRNA expression profiling in patients with lamin A/C‐associated muscular dystrophy , 2011, FASEB journal : official publication of the Federation of American Societies for Experimental Biology.

[18]  Peter Bühlmann,et al.  Analyzing gene expression data in terms of gene sets: methodological issues , 2007, Bioinform..

[19]  Isabelle Richard,et al.  A new pathway encompassing calpain 3 and its newly identified substrate cardiac ankyrin repeat protein is involved in the regulation of the nuclear factor‐κB pathway in skeletal muscle , 2010, The FEBS journal.

[20]  Lincoln Stein,et al.  Reactome knowledgebase of human biological pathways and processes , 2008, Nucleic Acids Res..

[21]  Johan T den Dunnen,et al.  Calpain 3 is a modulator of the dysferlin protein complex in skeletal muscle. , 2008, Human molecular genetics.

[22]  Kenneth H. Buetow,et al.  PID: the Pathway Interaction Database , 2008, Nucleic Acids Res..

[23]  Korbinian Strimmer,et al.  A general modular framework for gene set enrichment analysis , 2009, BMC Bioinformatics.

[24]  Pablo Tamayo,et al.  Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[25]  Susumu Goto,et al.  KEGG: Kyoto Encyclopedia of Genes and Genomes , 2000, Nucleic Acids Res..

[26]  Robert Gentleman,et al.  Gene Expression Profiles of B-lineage Adult Acute Lymphocytic Leukemia Reveal Genetic Patterns that Identify Lineage Derivation and Distinct Mechanisms of Transformation , 2005, Clinical Cancer Research.

[27]  Y. Haupt,et al.  C-Abl as a modulator of p53. , 2005, Biochemical and biophysical research communications.

[28]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[29]  Gabriele Sales,et al.  graphite - a Bioconductor package to convert pathway topology to gene network , 2012, BMC Bioinformatics.

[30]  中尾 光輝,et al.  KEGG(Kyoto Encyclopedia of Genes and Genomes)〔和文〕 (特集 ゲノム医学の現在と未来--基礎と臨床) -- (データベース) , 2000 .

[31]  M. Kimmel,et al.  Conflict of interest statement. None declared. , 2010 .

[32]  R. Ren,et al.  Mechanisms of BCR–ABL in the pathogenesis of chronic myelogenous leukaemia , 2005, Nature Reviews Cancer.

[33]  R. Myers,et al.  Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data , 2005, Nucleic acids research.

[34]  B. Shneiderman,et al.  Nuclear envelope dystrophies show a transcriptional fingerprint suggesting disruption of Rb-MyoD pathways in muscle regeneration. , 2006, Brain : a journal of neurology.

[35]  Sandrine Dudoit,et al.  More power via graph-structured tests for differential expression of gene networks , 2012, 1206.6980.

[36]  Chien-Chang Chen,et al.  Defective membrane repair in dysferlin-deficient muscular dystrophy , 2003, Nature.