An Open Source Microarray Data Analysis System with GUI: Quintet

We present Quintet, an R-based unified cDNA microarray data analysis system with a GUI. Quintet coherently integrates five principal categories of microarray data analysis: data processing steps such as faulty spot filtering and normalization, data quality assessment (QA), identification of differentially expressed genes (DEGs), clustering of gene expression profiles, and classification of samples. Although many microarray data analysis systems treat DEG identification and clustering/classification as the most important problems, we emphasize that data processing and QA are equally important and should be part of routine data analysis practice because microarray data are very noisy. In each analysis category, customized plots and statistical summaries are provided for users' convenience. Using these plots and summaries, users can easily examine analysis results for biological plausibility and compare them with other results. Because Quintet is written in R, it is highly extensible: users can insert new algorithms and experiment with them with minimal effort. The GUI makes Quintet easy to learn and use, and because the R language and its GUI engine, Tcl/Tk, are available on all major operating systems, Quintet is also OS-independent.
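
To make the data processing category concrete, the following is a minimal sketch of faulty spot filtering followed by global median normalization of log ratios for two-channel cDNA data. It is not Quintet's code (Quintet is written in R, and the abstract does not specify its algorithms). The intensity cutoff, the field names `red` and `green`, and the function names are all hypothetical choices made for illustration.

```python
import math
from statistics import median


def filter_spots(spots, min_intensity=100.0):
    """Keep spots whose red and green intensities both reach a cutoff.

    The cutoff value is a hypothetical choice, not one taken from Quintet.
    """
    return [
        s for s in spots
        if s["red"] >= min_intensity and s["green"] >= min_intensity
    ]


def log_ratios(spots):
    """Compute the log2(red / green) expression ratio for each spot."""
    return {s["gene"]: math.log2(s["red"] / s["green"]) for s in spots}


def median_normalize(ratios):
    """Center the log ratios so that their median is zero."""
    center = median(ratios.values())
    return {gene: value - center for gene, value in ratios.items()}


# Toy data: the last spot has a weak red signal and is treated as faulty.
spots = [
    {"gene": "g1", "red": 1600.0, "green": 400.0},
    {"gene": "g2", "red": 800.0, "green": 400.0},
    {"gene": "g3", "red": 400.0, "green": 400.0},
    {"gene": "g4", "red": 50.0, "green": 900.0},
]

kept = filter_spots(spots)
normalized = median_normalize(log_ratios(kept))
```

Global median centering is only one of several normalization strategies; intensity-dependent methods are common for cDNA arrays and would replace `median_normalize` in this sketch.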
