MMAD: microarray microdissection with analysis of differences is a computational tool for deconvoluting cell type-specific contributions from tissue samples

BACKGROUND Tissue heterogeneity is a significant obstacle to the development of clinically relevant microarray-derived biomarkers and classifiers. Physical cell separation techniques, such as cell sorting and laser-capture microdissection, can enrich samples for cell types of interest, but they are costly and labor-intensive, and they can limit investigation of important interactions between different cell types.

RESULTS We developed a new computational approach, called microarray microdissection with analysis of differences (MMAD), which performs microdissection in silico. Notably, MMAD (i) allows simultaneous estimation of the cell fractions and gene expression profiles of contributing cell types, (ii) adjusts for microarray normalization bias, (iii) uses the corrected Akaike information criterion during model optimization to minimize overfitting and (iv) provides mechanisms for comparing gene expression and cell fractions between samples in different classes. Computational microdissection of simulated and experimental tissue mixture datasets showed tight correlations between predicted and measured gene expression of pure tissues, as well as between reported and estimated cell fractions for each individual cell type. In simulation studies, MMAD detected differentially expressed genes in mixed tissue samples better than standard methods, including both significance analysis of microarrays and cell type-specific significance analysis of microarrays.

CONCLUSIONS We have developed a new computational tool, MMAD, which performs robust tissue microdissection in silico and can improve the detection of differentially expressed genes. MMAD software, implemented in MATLAB, is publicly available for download at http://sourceforge.net/projects/mmad/.
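The two ideas named in the results, a mixture model linking cell fractions to expression profiles and the corrected Akaike information criterion (AICc), can be illustrated with a minimal Python sketch. It is not the MMAD implementation, which is distributed as MATLAB code. The sketch assumes a linear mixing model on non-log expression values, two cell types whose pure profiles are already known (MMAD estimates fractions and profiles simultaneously), and the least-squares form of AICc. The function names `mix`, `estimate_fraction` and `aicc`, and the toy profiles, are hypothetical and chosen only for illustration.

```python
import math


def mix(fractions, profiles):
    """Linear mixing model (assumption): each gene's mixed expression is the
    fraction-weighted sum of the pure cell-type expression values."""
    n_genes = len(profiles[0])
    return [
        sum(f * p[g] for f, p in zip(fractions, profiles))
        for g in range(n_genes)
    ]


def estimate_fraction(mixed, profile_a, profile_b):
    """Least-squares fraction f of cell type A, assuming
    mixed = f * A + (1 - f) * B with both pure profiles known.
    The estimate is clipped to the valid range [0, 1]."""
    num = sum((m - b) * (a - b) for m, a, b in zip(mixed, profile_a, profile_b))
    den = sum((a - b) ** 2 for a, b in zip(profile_a, profile_b))
    if den == 0:
        raise ValueError("pure profiles are identical; fraction is not identifiable")
    return min(1.0, max(0.0, num / den))


def aicc(rss, n, k):
    """Corrected AIC for a least-squares fit with Gaussian errors:
    AICc = n * ln(rss / n) + 2k + 2k(k + 1) / (n - k - 1).
    rss: residual sum of squares, n: observations, k: fitted parameters.
    Conventions differ on whether k includes the error variance."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    if rss <= 0:
        raise ValueError("rss must be positive")
    return n * math.log(rss / n) + 2 * k + (2 * k * (k + 1)) / (n - k - 1)


# Toy data (hypothetical): four genes, two cell types.
tumor = [10.0, 2.0, 8.0, 1.0]
stroma = [1.0, 9.0, 2.0, 7.0]
sample = mix([0.3, 0.7], [tumor, stroma])
print("estimated tumor fraction:", estimate_fraction(sample, tumor, stroma))

# With the same residual error, the model with more parameters is penalized.
print("AICc, k=2:", aicc(10.0, 20, 2))
print("AICc, k=3:", aicc(10.0, 20, 3))
```

In this form, AICc adds a small-sample penalty on top of the usual AIC term, so a model with extra cell types or parameters is preferred only if it reduces the residual error enough to offset that penalty.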
