Tests for finding complex patterns of differential expression in cancers: towards individualized medicine

Background: Microarray studies in cancer compare expression levels between two or more sample groups on thousands of genes. Data analysis typically follows a population-level approach (e.g., comparison of sample means) to identify differentially expressed genes. This leads to the discovery of 'population-level' markers, i.e., genes with the expression patterns A > B and B > A. We introduce the PPST test, which identifies genes where a significantly large subset of cases exhibit expression values beyond upper and lower thresholds observed in the control samples.

Results: Interestingly, the test identifies A > B and B > A pattern genes that are missed by population-level approaches, such as the t-test, and many genes that exhibit both significant overexpression and significant underexpression in significantly large subsets of cancer patients (ABA pattern genes). These patterns tend to show distributions that are unique to individual genes, and are aptly visualized in a 'gene expression pattern grid'. The low degree of among-gene correlation in these genes suggests unique underlying genomic pathologies and a high degree of unique tumor-specific differential expression. We compare the PPST and the ABA test to the parametric and nonparametric t-tests by analyzing two independently published data sets from studies of progression in astrocytoma.

Conclusions: The PPST test yielded findings similar to those of the nonparametric t-test, with higher self-consistency. These tests and the gene expression pattern grid may be useful for identifying therapeutic targets and diagnostic or prognostic markers that are present only in subsets of cancer patients, and they provide a more complete portrait of differential expression in cancer.
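The subset-based idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' PPST implementation: the 95th/5th percentile thresholds, the label-permutation scheme, and all function names (`percentile`, `exceedance_counts`, `permutation_pvalues`) are assumptions made purely for illustration. For a single gene, the sketch counts cases that fall above an upper or below a lower threshold derived from the controls, then estimates how unusual those counts are under random relabeling of samples.

```python
import random


def percentile(values, q):
    """Linear-interpolation percentile of a list, with q in [0, 100]."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def exceedance_counts(controls, cases, q=95.0):
    """Count cases above the upper and below the lower control threshold.

    Assumption: thresholds are the q-th and (100 - q)-th control percentiles.
    """
    upper = percentile(controls, q)
    lower = percentile(controls, 100.0 - q)
    n_over = sum(1 for x in cases if x > upper)
    n_under = sum(1 for x in cases if x < lower)
    return n_over, n_under


def permutation_pvalues(controls, cases, q=95.0, n_perm=2000, seed=0):
    """Estimate permutation p-values for the over- and under-expression counts.

    Assumption: group labels are shuffled and thresholds are recomputed
    from the permuted 'control' group in every permutation.
    """
    rng = random.Random(seed)
    obs_over, obs_under = exceedance_counts(controls, cases, q)
    pooled = list(controls) + list(cases)
    n_ctrl = len(controls)
    hits_over = hits_under = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        over, under = exceedance_counts(pooled[:n_ctrl], pooled[n_ctrl:], q)
        hits_over += over >= obs_over
        hits_under += under >= obs_under
    # Add-one correction keeps the estimated p-values strictly positive.
    p_over = (hits_over + 1) / (n_perm + 1)
    p_under = (hits_under + 1) / (n_perm + 1)
    return (obs_over, p_over), (obs_under, p_under)


# Hypothetical expression values for one gene (made-up numbers).
controls = [4.8, 4.9, 5.0, 5.0, 5.1, 5.1, 5.2, 4.9, 5.0, 5.1]
cases = [7.9, 8.1, 8.3, 7.7, 2.1, 1.9, 2.3, 2.0, 5.0, 5.1]

over_result, under_result = permutation_pvalues(controls, cases)
print("overexpressed cases, p-value:", over_result)
print("underexpressed cases, p-value:", under_result)
```

In this made-up example the case mean (5.04) is almost identical to the control mean (5.01), so a comparison of sample means would see little difference, yet four cases lie above and four below the control thresholds. That is the kind of gene the abstract calls an ABA pattern gene.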
