Identifying Biologically Significant Pathways by Gene Set Enrichment Analysis Using Fisher's Criterion

Gene set enrichment analysis (GSEA) is a computational method for identifying gene sets that show statistically significant differential expression between two groups. Unlike earlier approaches, it provides a unified analytical framework that combines a priori biological knowledge with gene expression profiles during the analysis, which helps reveal the biological meaning of the results. In the original GSEA, all genes in a given dataset are ordered by the signal-to-noise ratio of their microarray expression profiles between the two groups before the subsequent analysis steps are carried out. Despite the impressive results of earlier studies using the original GSEA, ranking genes by the signal-to-noise ratio makes it difficult to extract highly up-regulated and highly down-regulated genes together as significant genes, so the ranking may fail to reflect situations that arise in metabolic and signaling pathways. Further investigation is therefore needed to better identify biologically significant pathways. To address this problem, this article explores gene set enrichment analysis with Fisher's criterion for gene ranking, named FC-GSEA, and evaluates its effect on leukemia-related pathway analyses.
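
The difference between the two ranking statistics can be illustrated with a minimal sketch. It assumes the commonly used forms of both scores, namely a signed signal-to-noise ratio (difference of group means over the sum of standard deviations) and an unsigned Fisher's criterion (squared difference of means over the sum of variances); the exact definitions used by FC-GSEA may differ. The gene names and expression values are hypothetical toy data, not results from the study.

```python
from statistics import mean, pstdev

def signal_to_noise(a, b):
    """Signed score (assumed form): (mean_a - mean_b) / (sd_a + sd_b)."""
    return (mean(a) - mean(b)) / (pstdev(a) + pstdev(b))

def fisher_criterion(a, b):
    """Unsigned score (assumed form): (mean_a - mean_b)^2 / (var_a + var_b)."""
    return (mean(a) - mean(b)) ** 2 / (pstdev(a) ** 2 + pstdev(b) ** 2)

# Hypothetical toy expression values (group A, group B) for three genes.
genes = {
    "up_gene": ([9.0, 10.0, 11.0], [1.0, 2.0, 3.0]),
    "down_gene": ([1.0, 2.0, 3.0], [9.0, 10.0, 11.0]),
    "flat_gene": ([5.0, 6.0, 7.0], [5.5, 6.0, 6.5]),
}

def rank(score):
    """Order gene names from highest to lowest score."""
    return sorted(genes, key=lambda g: score(*genes[g]), reverse=True)

snr_rank = rank(signal_to_noise)
fc_rank = rank(fisher_criterion)

print("signal-to-noise ranking:", snr_rank)
print("Fisher's criterion ranking:", fc_rank)
```

With the signed score, the up-regulated and down-regulated genes land at opposite ends of the list, separated by the unchanged gene. With the squared score, both differentially expressed genes rank ahead of the unchanged gene, which is the behavior the Fisher's criterion ranking is meant to provide.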
