COVID-19Base: A knowledgebase to explore biomedical entities related to COVID-19

We present COVID-19Base, a knowledgebase that highlights biomedical entities related to COVID-19, built through literature mining. To develop COVID-19Base, we mine information from publicly available scientific literature and related public resources. Seven topic-specific dictionaries, covering human genes, human miRNAs, human lncRNAs, diseases, Protein Data Bank, drugs, and drug side effects, are integrated to mine the scientific evidence related to COVID-19. We employ an automated literature mining and labeling system that uses a novel approach, based on natural language processing, sentiment analysis, and deep learning, to measure the effectiveness of drugs against diseases. To the best of our knowledge, this is the first knowledgebase dedicated to COVID-19 that integrates such a large variety of related biomedical entities through literature mining. Careful investigation of the mined biomedical entities, together with the interactions identified among them and reported in COVID-19Base, would help the research community discover possible avenues for the therapeutic treatment of COVID-19.
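As a rough illustration of dictionary-based entity tagging of the kind described above, the following minimal Python sketch matches a few terms against a sentence. It is not the COVID-19Base pipeline: the dictionary contents, the names `DICTIONARIES`, `build_matcher`, and `tag_entities`, and the regex-based matching strategy are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Toy dictionaries. The terms are illustrative placeholders only and are
# NOT taken from the actual COVID-19Base dictionaries.
DICTIONARIES = {
    "gene": ["ACE2", "TMPRSS2"],
    "drug": ["lopinavir", "ritonavir"],
    "disease": ["pneumonia"],
}


def build_matcher(dictionaries):
    """Compile one case-insensitive pattern covering every dictionary term."""
    term_to_type = {}
    for entity_type, terms in dictionaries.items():
        for term in terms:
            term_to_type[term.lower()] = entity_type
    # Try longer terms first so a longer term wins over a shorter prefix.
    alternatives = sorted(term_to_type, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in alternatives) + r")(?!\w)",
        re.IGNORECASE,
    )
    return pattern, term_to_type


def tag_entities(text, pattern, term_to_type):
    """Return the matched terms (lowercased, sorted) grouped by entity type."""
    found = defaultdict(set)
    for match in pattern.finditer(text):
        term = match.group(1).lower()
        found[term_to_type[term]].add(term)
    return {entity_type: sorted(terms) for entity_type, terms in found.items()}


if __name__ == "__main__":
    pattern, term_to_type = build_matcher(DICTIONARIES)
    sentence = "Lopinavir and ritonavir were tested; ACE2 is the entry receptor."
    print(tag_entities(sentence, pattern, term_to_type))
```

A single compiled alternation is adequate for a handful of terms. For dictionaries with many thousands of entries, a multi-pattern automaton would likely scale better, but that choice is outside what the abstract specifies.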
