Role of Biological Data Mining and Machine Learning Techniques in Detecting and Diagnosing the Novel Coronavirus (COVID-19): A Systematic Review

Coronaviruses (CoVs) are a large family of viruses that are common in many animal species, including camels, cattle, cats and bats. Animal CoVs, such as Middle East respiratory syndrome (MERS)-CoV, severe acute respiratory syndrome (SARS)-CoV, and the new virus named SARS-CoV-2, rarely infect and spread among humans. On January 30, 2020, the International Health Regulations Emergency Committee of the World Health Organisation declared the outbreak of the disease caused by this new CoV, called ‘COVID-19’, a ‘public health emergency of international concern’. This global pandemic has affected almost the whole planet and caused more than 315,131 deaths as of the date of this article. In this context, publishers, journals and researchers are urged to pursue research across different domains to help stop the spread of this deadly virus. Growing interest in artificial intelligence (AI) applications has helped address several medical problems. However, such applications remain insufficient given the high potential threat posed by this virus to global public health. This systematic review addresses automated AI applications based on data mining and machine learning (ML) algorithms for detecting and diagnosing COVID-19. We aimed to obtain an overview of this critical virus, address the limitations of utilising data mining and ML algorithms, and provide the health sector with the benefits of these techniques. We used five databases, namely, IEEE Xplore, Web of Science, PubMed, ScienceDirect and Scopus, and performed three sequences of search queries covering the period from 2010 to 2020. Precise exclusion criteria and a selection strategy were applied to screen the 1305 articles obtained. Only eight articles were fully evaluated and included in this review, a number that underscores the insufficiency of research in this important area.
After analysing all included studies, the results were categorised by year of publication and by the commonly used data mining and ML algorithms. The results reported in the papers were discussed to identify gaps across the reviewed literature. Characteristics such as motivations, challenges, limitations, recommendations, case studies, and the features and classes used were analysed in detail. This study reviewed the state-of-the-art techniques for CoV prediction algorithms based on data mining and ML assessment. The reliability and acceptability of the information and datasets extracted from the technologies implemented in the literature were considered. Findings showed that researchers must build on the insights they gain, focus on identifying solutions for CoV problems, and introduce new improvements. The growing emphasis on data mining and ML techniques in medical fields can provide the right environment for change and improvement.
