A decision analysis model for KEGG pathway analysis

Background: Knowledge base-driven pathway analysis is becoming the first choice for many investigators. It reduces the complexity of functional analysis by grouping thousands of genes into just several hundred pathways, and it increases the explanatory power of an experiment by identifying the pathways that are active under different conditions. However, current approaches analyze a biological system under the assumption that each pathway is independent of the others.

Results: This article develops a decision analysis model that accounts for dependence among pathways in time-course experiments and multiple-treatment experiments. The model introduces a decision coefficient, a purpose-designed index, to identify the most relevant pathways in a given experiment. The index takes into account not only the direct determination factor of each Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway itself, but also the indirect determination factors from its related pathways. The direct and indirect determination factors of each pathway are also used to describe the regulation mechanisms among KEGG pathways, and the sign of the decision coefficient gives a preliminary estimate of the impact direction of each KEGG pathway. A simulation study demonstrated the application of the decision analysis model to KEGG pathway analysis.

Conclusions: A microarray dataset from bovine mammary tissue covering the entire lactation cycle was used to further illustrate the strategy. The results showed that the decision analysis model can provide promising and more biologically meaningful results. The decision analysis model is therefore an initial attempt at optimizing pathway analysis methodology.
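The abstract does not give the formula for the decision coefficient. As a rough illustration only, the sketch below uses the definition common in the path-analysis literature: the direct determination factor of variable i is b_i^2, the indirect determination factor through variable j is 2 b_i r_ij b_j, and the decision coefficient is their sum, which equals 2 b_i r_iy - b_i^2. The function name `decision_coefficients`, the use of one summary score per pathway as the predictor, and the response `y` are assumptions made for this example; the article's own model may differ.

```python
import numpy as np

def decision_coefficients(X, y):
    """Path-analysis style decision coefficients (illustrative sketch).

    X: (n_samples, n_pathways) array of per-pathway scores (assumed input).
    y: (n_samples,) response variable (assumed input).
    Returns direct path coefficients b, direct determination factors,
    the matrix of indirect determination factors, and decision coefficients.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)

    # Standardize so that cross-products give Pearson correlations.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)

    R = Z.T @ Z / (n - 1)        # correlations among pathways, r_ij
    r_y = Z.T @ zy / (n - 1)     # correlations with the response, r_iy

    # Direct path coefficients solve the normal equations R b = r_y.
    b = np.linalg.solve(R, r_y)

    direct = b ** 2                          # direct determination factors
    indirect = 2.0 * np.outer(b, b) * R      # 2 * b_i * r_ij * b_j
    np.fill_diagonal(indirect, 0.0)          # keep only j != i terms

    decision = direct + indirect.sum(axis=1)  # equals 2*b*r_y - b**2
    return b, direct, indirect, decision

# Synthetic data standing in for three correlated pathway scores.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 1] += 0.5 * X[:, 0]                     # make pathways 0 and 1 dependent
y = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(scale=0.1, size=50)

b, direct, indirect, decision = decision_coefficients(X, y)
print(decision)
```

Under this definition, ranking pathways by `decision` rather than by `direct` alone is what lets correlated pathways influence one another's ranking, and the sign of each entry is what the abstract refers to as a preliminary estimate of impact direction.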
