GSMA: Gene Set Matrix Analysis, An Automated Method for Rapid Hypothesis Testing of Gene Expression Data

Background: Microarray technology has become highly valuable for identifying complex global changes in gene expression patterns. Assigning functional information to these patterns remains a challenge, both for interpreting data effectively and for correlating results across experiments, projects and laboratories. Methods that allow rapid and robust evaluation of multiple functional hypotheses let individual researchers mine gene expression data more efficiently. Results: We have developed gene set matrix analysis (GSMA), a method for rapidly testing group-wise up- or down-regulation of gene expression for multiple lists of genes (gene sets) simultaneously, against entire distributions of gene expression changes (datasets) from single or multiple experiments. The utility of GSMA lies in its flexibility: gene sets related by known biological function, or designated solely by the end-user, can be rapidly polled against large numbers of datasets at once. Conclusions: GSMA provides a simple and straightforward method for hypothesis testing in which genes are tested by groups across multiple datasets for patterns of expression enrichment.
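The idea of testing many gene sets against many datasets can be illustrated with a short sketch. The abstract does not state which test statistic GSMA uses, so the code below assumes a parametric Z score that compares the mean expression change of a gene set with the mean and standard deviation of the whole dataset. The function names `gene_set_z` and `gsma_matrix` and the input layout (dictionaries keyed by gene name) are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, pstdev


def gene_set_z(changes, gene_set):
    """Z score of one gene set against one dataset (assumed statistic).

    changes:  dict mapping gene name -> expression change in this dataset
    gene_set: iterable of gene names
    """
    values = list(changes.values())
    mu = mean(values)        # mean change over the entire dataset
    sigma = pstdev(values)   # population standard deviation of the dataset
    # Only genes actually measured in this dataset contribute to the score.
    members = [changes[g] for g in gene_set if g in changes]
    if not members or sigma == 0:
        return float("nan")
    # Positive values suggest group-wise up-regulation, negative values
    # group-wise down-regulation, relative to the whole distribution.
    return (mean(members) - mu) * math.sqrt(len(members)) / sigma


def gsma_matrix(datasets, gene_sets):
    """Build a gene-set-by-dataset matrix of Z scores as nested dicts."""
    return {
        set_name: {
            ds_name: gene_set_z(changes, genes)
            for ds_name, changes in datasets.items()
        }
        for set_name, genes in gene_sets.items()
    }


# Toy input: two datasets and two user-defined gene sets (made-up values).
datasets = {
    "exp1": {"a": 2.0, "b": 2.0, "c": -2.0, "d": -2.0},
    "exp2": {"a": -1.0, "b": -1.0, "c": 1.0, "d": 1.0},
}
gene_sets = {"set_up": ["a", "b"], "set_down": ["c", "d"]}

matrix = gsma_matrix(datasets, gene_sets)
for set_name, row in matrix.items():
    print(set_name, {ds: round(z, 3) for ds, z in row.items()})
```

Each row of the resulting matrix is a gene set and each column a dataset, so a pattern of enrichment across experiments can be read off a single table. A real analysis would also need a significance threshold and a multiple-testing correction, which this sketch omits.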
