Diffuse large B-cell lymphoma classification using linguistic analysis and ensembled artificial neural networks

Abstract: This study applies non-medical methods to classify two types of diffuse large B-cell lymphoma (DLBCL): the germinal-center B-cell type (GCB) and the activated B-cell type (ABC). The study materials are microRNAs (miRNAs) acquired from DLBCL patients. A statistical method (linguistic analysis) and an engineering method (ensembled artificial neural networks, EANN) were applied independently to perform qualitative and quantitative analysis, respectively. On this basis, a novel noise-elimination-enhanced algorithm, namely ensembled linguistic analysis, is proposed to improve the efficiency of linguistic analysis. According to the results, the phylogenetic tree obtained with the ensembled approach performs better than that of the initial linguistic analysis. An EANN model was established to perform the classification quantitatively, and sensitivity analysis (SA) of the EANN was carried out to rank the miRNAs by significance and select the 5 most important ones. In addition, classical linear and logistic regression models were developed for comparison with the EANN classification results; the regression results were clearly worse than those of the EANN model. This study shows that each lymphoma type has a distinctive miRNA expression pattern and that the miRNA expression pattern of ABC is closer to white noise than that of GCB. Both linguistic analysis and the EANN model achieved accurate results; however, the EANN model performed much better for classification. The 5 selected miRNAs will be helpful for further study.
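The abstract does not spell out how the linguistic analysis is computed. As a rough illustration only, the sketch below shows one common rank-order formulation for comparing two numeric profiles. Each profile is encoded as overlapping binary "words" (increase vs. no increase), the words are ranked by frequency, and the rank differences are combined with entropy-based weights. The function names (`to_words`, `word_stats`, `rank_distance`), the word length `m`, and the weighting are assumptions made for this example and are not taken from the study.

```python
import math
from collections import Counter
from itertools import product


def to_words(series, m=3):
    """Encode a series as overlapping m-bit words (1 = increase, 0 = otherwise)."""
    bits = [1 if b > a else 0 for a, b in zip(series, series[1:])]
    return [tuple(bits[i:i + m]) for i in range(len(bits) - m + 1)]


def word_stats(series, m=3):
    """Return (probabilities, ranks) over all 2**m words; rank 1 = most frequent."""
    counts = Counter(to_words(series, m))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("series must contain at least m + 1 values")
    vocab = list(product((0, 1), repeat=m))
    probs = {w: counts.get(w, 0) / total for w in vocab}
    # Ties are broken by word order; this is an arbitrary choice for the sketch.
    ordered = sorted(vocab, key=lambda w: (-probs[w], w))
    ranks = {w: i + 1 for i, w in enumerate(ordered)}
    return probs, ranks


def rank_distance(s1, s2, m=3):
    """Entropy-weighted mean rank difference between two series, scaled to [0, 1]."""
    p1, r1 = word_stats(s1, m)
    p2, r2 = word_stats(s2, m)

    def entropy_term(p):
        return -p * math.log(p) if p > 0 else 0.0

    weights = {w: entropy_term(p1[w]) + entropy_term(p2[w]) for w in p1}
    z = sum(weights.values())
    n = len(p1)
    if z == 0:
        # Both series use a single word each: fall back to uniform weights.
        weights = {w: 1.0 for w in p1}
        z = float(n)
    return sum(abs(r1[w] - r2[w]) * weights[w] for w in p1) / (z * (n - 1))
```

A matrix of such pairwise distances between samples could then be passed to a hierarchical clustering routine to draw a tree. The specific clustering and noise-elimination steps used in the study are not described in the abstract and are not reproduced here.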
