Machine Learning-Based Ensemble Recursive Feature Selection of Circulating miRNAs for Cancer Tumor Classification

Circulating microRNAs (miRNAs) are small noncoding RNA molecules that can be detected in bodily fluids without the need for major invasive procedures on patients. miRNAs have shown great promise as tumor biomarkers, both to assess the presence of a tumor and to predict its type and subtype. Recently, thanks to the availability of miRNA datasets, machine learning techniques have been successfully applied to tumor classification. The results, however, are difficult for medical experts to assess and interpret because the algorithms exploit information from thousands of miRNAs. In this work, we propose a novel technique that aims at reducing the necessary information to the smallest possible set of circulating miRNAs. The dimensionality reduction achieved is a very important first step in a potential, clinically actionable, circulating miRNA-based precision medicine pipeline. While it is currently under discussion whether this first step can be taken, we demonstrate here that it is possible to perform classification tasks on circulating miRNAs by exploiting a recursive feature elimination procedure that integrates a heterogeneous ensemble of high-quality, state-of-the-art classifiers. Because a heterogeneous ensemble combines different classification algorithms, it can compensate for the inherent biases of individual classifiers. Feature selection then further eliminates biases that emerge from using data from different studies or batches, yielding more robust and reliable outcomes. The proposed approach is first tested on a tumor classification problem, separating 10 different types of cancer with samples collected over 10 different clinical trials, and is later assessed on a cancer subtype classification task, with the aim of distinguishing triple negative breast cancer from other subtypes of breast cancer. Overall, the presented methodology proves to be effective and compares favorably to other state-of-the-art feature selection methods.
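
The general idea of recursive feature elimination driven by a heterogeneous ensemble can be illustrated with a minimal sketch. It is not the procedure evaluated in this work: the three scikit-learn classifiers, the rank-averaging rule, the `drop_fraction` parameter, the helper names `feature_scores` and `ensemble_rfe`, and the synthetic data standing in for a miRNA expression matrix are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC


def feature_scores(model, X, y):
    """Fit one classifier and return a non-negative importance per feature."""
    model.fit(X, y)
    if hasattr(model, "feature_importances_"):  # tree-based models
        return model.feature_importances_
    # Linear models: coef_ has shape (n_classes or 1, n_features).
    return np.abs(model.coef_).sum(axis=0)


def ensemble_rfe(X, y, models, n_keep, drop_fraction=0.2):
    """Repeatedly drop the features that the ensemble ranks lowest (hypothetical helper)."""
    active = np.arange(X.shape[1])  # column indices still in play
    while active.size > n_keep:
        # Sum per-classifier ranks so that no single model's score scale dominates.
        ranks = np.zeros(active.size)
        for model in models:
            scores = feature_scores(model, X[:, active], y)
            ranks += scores.argsort().argsort()  # rank 0 = least important
        n_drop = max(1, int(drop_fraction * active.size))
        n_drop = min(n_drop, active.size - n_keep)  # never overshoot the target
        keep = np.sort(ranks.argsort()[n_drop:])
        active = active[keep]
    return active


# Synthetic stand-in for an expression matrix: 200 samples, 50 features, 3 classes.
# With shuffle=False the 5 informative features are columns 0..4.
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=5, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, shuffle=False, random_state=0,
)

models = [
    RandomForestClassifier(n_estimators=50, random_state=0),
    LogisticRegression(max_iter=1000),
    LinearSVC(max_iter=5000),
]

selected = ensemble_rfe(X, y, models, n_keep=5)
print("Selected feature indices:", selected)
```

Aggregating ranks rather than raw scores is one simple way to combine classifiers whose importance measures are on different scales; voting on each classifier's top-k features would be an equally plausible choice.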
