MicroRNA expression classification for pediatric multiple sclerosis identification

MicroRNAs (miRNAs) are a set of short non-coding RNAs that play significant regulatory roles in cells. The study of miRNA data produced by Next-Generation Sequencing techniques can substantially aid the analysis of multifactorial diseases, such as Multiple Sclerosis (MS). Although extensive studies have been conducted on young adults affected by MS, very little work has been done to investigate the pathogenic mechanisms in pediatric patients, and none from a machine learning perspective. In this work, we report the experimental results of a classification study aimed at evaluating the effectiveness of machine learning methods in automatically distinguishing pediatric MS patients from healthy children, based on their miRNA expression profiles. Additionally, since Attention Deficit Hyperactivity Disorder (ADHD) shares some cognitive impairments with pediatric MS, we also included patients affected by ADHD in our study. Encouraging results were obtained with an artificial neural network model trained on a set of features chosen automatically by feature selection algorithms. The results show that models built on automatically selected features outperform models based on features selected by human experts. An automatic predictive model of this kind can support clinicians in early MS diagnosis and provide new insights that may help identify novel molecular pathways involved in MS.
