Classification of dendritic cell phenotypes from gene expression data

Background
The selection of relevant genes for sample classification is a common task in many gene expression studies. Although a number of tools have been developed to identify optimal gene expression signatures, they often generate gene lists that are too long to be exploited clinically. Consequently, researchers in the field try to identify the smallest set of genes that provides good sample classification. We investigated the genome-wide expression of the inflammatory phenotype in dendritic cells. Dendritic cells are a complex group of cells that play a critical role in vertebrate immunity. Therefore, the prediction of the inflammatory phenotype in these cells may help with the selection of immune-modulating compounds.

Results
A data mining protocol was applied to microarray data for murine cell lines treated with various inflammatory stimuli. The learning and validation data sets consisted of 155 and 49 samples, respectively. The data mining protocol reduced the number of probe sets from 5,802 to 10, then from 10 to 6 and finally from 6 to 3. The performances of a set of supervised classification models were compared. With the following six genes (Il12b, Cd40, Socs3, Irgm1, Plin2 and Lgals3bp), the best accuracy was obtained by Tree Augmented Naïve Bayes and Nearest Neighbour (91.8%). With the smallest set of three genes (Il12b, Cd40 and Socs3), the performance remained satisfactory and the best accuracy was obtained by Support Vector Machine (95.9%). These data mining models, using data for the genes Il12b, Cd40 and Socs3, were validated with a human data set consisting of 27 samples. Support Vector Machines (71.4%) and Nearest Neighbour (92.6%) gave the worst performances, but the remaining models correctly classified all 27 samples.

Conclusions
The genes selected by the proposed data mining protocol were shown to be informative for discriminating between inflammatory and steady-state phenotypes in dendritic cells. The robustness of the data mining protocol was confirmed by the accuracy obtained for a human data set when using only three genes: Il12b, Cd40 and Socs3. In summary, we analysed the longitudinal pattern of expression in dendritic cells stimulated with activating agents, with the aim of identifying signatures that would predict or explain the dendritic cell response to an inflammatory agent.
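
As an illustration of the kind of evaluation summarised above, the following is a minimal sketch of a Nearest Neighbour classifier trained on a learning set and scored on a separate validation set, using three features that stand in for Il12b, Cd40 and Socs3. The expression values, the Euclidean distance and the default k = 1 are assumptions made for this example only. They are not taken from the study, and the function names (euclidean, knn_predict, accuracy) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical log-scale expression values for (Il12b, Cd40, Socs3).
# These numbers are invented for illustration; they are NOT data from the study.
LEARNING = [
    ((8.9, 9.4, 9.1), "inflammatory"),
    ((9.3, 9.8, 8.7), "inflammatory"),
    ((8.5, 9.1, 9.5), "inflammatory"),
    ((4.1, 6.0, 5.2), "steady-state"),
    ((3.8, 6.4, 4.9), "steady-state"),
    ((4.5, 5.7, 5.5), "steady-state"),
]

# Held-out samples, likewise invented, used only to score the classifier.
VALIDATION = [
    ((9.0, 9.5, 9.0), "inflammatory"),
    ((4.0, 6.1, 5.0), "steady-state"),
    ((8.2, 8.8, 8.4), "inflammatory"),
    ((5.0, 6.5, 5.8), "steady-state"),
]


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, sample, k=1):
    """Predict a label by majority vote among the k closest training samples."""
    neighbours = sorted(train, key=lambda item: euclidean(item[0], sample))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=1):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


if __name__ == "__main__":
    print(f"validation accuracy: {accuracy(LEARNING, VALIDATION):.1%}")
```

The same accuracy function could be applied to any other classifier that exposes a comparable predict step, which is how accuracies for different models on one validation set can be compared like for like.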
