Reference medical datasets (MosMedData) for independent external evaluation of algorithms based on artificial intelligence in diagnostics

The article describes a novel approach to creating annotated medical datasets for testing artificial intelligence-based diagnostic solutions. It covers four stages of dataset formation: planning, selection of initial data, annotation and verification, and documentation. Examples of datasets created with this approach are also presented. The approach is scalable and versatile, and it can be applied to other areas of medicine and healthcare that are being automated and developed using artificial intelligence and big data technologies.
