Harnessing Clusters for High Performance Computation of Gene Expression Microarray Comparative Analysis

Gene expression comparative analysis allows bioinformatics researchers to discover the functional regulation of genes. This is achieved by comparing datasets that represent the quantities of substances in a biological system. Because unnatural variations can be introduced during data collection and digitization, normalization algorithms must be applied to the data before any accurate comparison can be made. Many different normalization methods exist, each of which gives a different result. Comparing differently normalized datasets can reveal crucial regulated genes that might otherwise remain hidden because of errors in a single normalization study. In this paper we introduce a web-based software package called EXP-PAC, which uses a high-performance computing platform of computer clusters to run multiple normalization methods in parallel. By generating multiple normalized datasets concurrently, we allow researchers to improve the accuracy of their research at almost no extra time cost.
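The idea of running several normalization methods concurrently on the same raw dataset can be illustrated with a minimal sketch. This is not EXP-PAC's implementation: the toy matrix `RAW`, the three simplified normalization functions, and the helper `normalize_all` are all assumptions made for illustration, and a local thread pool stands in for the cluster, where each method would more plausibly be submitted as a separate job.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, median

# Hypothetical toy data: rows are genes, columns are arrays (samples).
RAW = [
    [120.0, 200.0, 90.0],
    [30.0, 64.0, 25.0],
    [500.0, 810.0, 400.0],
    [8.0, 16.0, 4.0],
]


def columns(matrix):
    """Transpose a gene-by-array matrix into a list of per-array columns."""
    return [list(col) for col in zip(*matrix)]


def rows(cols):
    """Transpose per-array columns back into a gene-by-array matrix."""
    return [list(row) for row in zip(*cols)]


def log2_transform(matrix):
    """Take the base-2 logarithm of every intensity."""
    return [[math.log2(v) for v in row] for row in matrix]


def median_scale(matrix):
    """Scale each array so its median equals the mean of all array medians."""
    cols = columns(matrix)
    medians = [median(c) for c in cols]
    target = mean(medians)
    return rows([[v * target / m for v in c] for c, m in zip(cols, medians)])


def quantile_normalize(matrix):
    """Give every array the same distribution: the mean of the sorted arrays.

    Simplified version: ties are broken by position rather than averaged.
    """
    cols = columns(matrix)
    rank_means = [mean(vals) for vals in zip(*(sorted(c) for c in cols))]
    normalized = []
    for c in cols:
        order = sorted(range(len(c)), key=lambda i: c[i])
        new_col = [0.0] * len(c)
        for rank, i in enumerate(order):
            new_col[i] = rank_means[rank]
        normalized.append(new_col)
    return rows(normalized)


# Assumed set of methods; a real study would choose its own.
METHODS = {
    "log2": log2_transform,
    "median_scale": median_scale,
    "quantile": quantile_normalize,
}


def normalize_all(matrix, methods=METHODS):
    """Run every normalization method on the same raw matrix concurrently."""
    with ThreadPoolExecutor(max_workers=len(methods)) as pool:
        futures = {name: pool.submit(fn, matrix) for name, fn in methods.items()}
        return {name: fut.result() for name, fut in futures.items()}


if __name__ == "__main__":
    for name, normalized in normalize_all(RAW).items():
        print(name, normalized[0])
```

Each method receives the same unmodified input and returns its own normalized copy, so the results can be compared side by side afterwards. Because the methods do not depend on one another, the elapsed time is bounded by the slowest method rather than the sum of all of them, provided each one gets its own worker.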
