A novel application of quantile regression for identification of biomarkers exemplified by equine cartilage microarray data

Background: Identification of biomarkers among thousands of genes arrayed for disease classification has been the subject of considerable research in recent years. These studies have focused on disease classification, comparing experimental groups of affected patients to normal patients. Related experiments can be done to identify tissue-restricted biomarkers, that is, genes with a high level of expression in one tissue compared to other tissue types in the body.

Results: In this study, cartilage was compared with ten other body tissues using a two-color array experimental design. Thirty-seven probe sets were identified as cartilage biomarkers. Of these, 13 (35%) have existing annotation associated with cartilage, including several well-established cartilage biomarkers. These genes comprise a useful database from which novel targets for cartilage biology research can be selected. We determined cartilage-specific Z-scores based on the observed M and classified genes with Z-scores ≥ 1.96 in all ten cartilage/tissue comparisons as cartilage-specific genes.

Conclusion: Quantile regression is a promising method for the analysis of two-color array experiments that compare multiple samples in the absence of biological replicates, thereby limiting quantifiable error. We used a nonparametric approach to reveal the relationship between percentiles of M and A, where M is log2(R/G) and A is 0.5 log2(RG), with R representing the gene expression level in cartilage and G representing the gene expression level in one of the other 10 tissues. We then performed linear quantile regression to identify genes with a cartilage-restricted pattern of expression.
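The quantities named in the abstract can be illustrated with a minimal Python sketch. The definitions of M and A and the Z ≥ 1.96 rule across all comparisons come from the abstract. Everything else is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the quantile fit is a brute-force search that is only practical for small inputs, and the Z-score is scaled by the gap between a fitted upper (for example 97.5th percentile) line and the fitted median line, which is one plausible construction rather than the authors' stated formula.

```python
import math
from itertools import combinations


def ma_values(r, g):
    """Return (M, A) for one probe: M = log2(R/G), A = 0.5 * log2(R*G)."""
    return math.log2(r / g), 0.5 * math.log2(r * g)


def check_loss(residual, tau):
    """Check (pinball) loss used in quantile regression at level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def fit_linear_quantile(a_vals, m_vals, tau):
    """Fit M = intercept + slope * A at quantile level tau.

    Illustrative brute force: at least one optimal line for the check
    loss passes through two data points, so every pair of points is
    tried and the line with the smallest total loss is kept. This is
    O(n^3) and meant only for small examples.
    """
    best = None
    for i, j in combinations(range(len(a_vals)), 2):
        if a_vals[i] == a_vals[j]:
            continue  # vertical line, not a valid regression fit
        slope = (m_vals[j] - m_vals[i]) / (a_vals[j] - a_vals[i])
        intercept = m_vals[i] - slope * a_vals[i]
        loss = sum(
            check_loss(m - (intercept + slope * a), tau)
            for a, m in zip(a_vals, m_vals)
        )
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct A values")
    return best[1], best[2]


def z_score(m, a, median_fit, upper_fit, z_upper=1.96):
    """Assumed Z-score: distance of M from the fitted median at A,
    scaled so that the fitted upper-quantile line maps to z_upper."""
    median = median_fit[0] + median_fit[1] * a
    upper = upper_fit[0] + upper_fit[1] * a
    scale = (upper - median) / z_upper
    if scale <= 0:
        raise ValueError("upper quantile line must lie above the median line")
    return (m - median) / scale


def is_tissue_restricted(z_scores, threshold=1.96):
    """True when the Z-score meets the threshold in every comparison."""
    return all(z >= threshold for z in z_scores)
```

In use, one would compute M and A for every probe in each cartilage/tissue comparison, fit the median and upper lines per comparison, and pass each gene's ten Z-scores to `is_tissue_restricted`. For arrays with thousands of probes, a dedicated quantile regression solver would be needed in place of the brute-force fit.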
