NOVEL APPROACHES TO THE DEVELOPMENT OF ARTIFICIAL INTELLIGENCE ALGORITHMS FOR LUNG CANCER DIAGNOSTICS

The development of an intelligent automated diagnostic system (IADS) for lung cancer (LC) detection is relevant because of the social significance of this disease and its leading position among all cancers. In principle, an IADS can be used both at the screening stage and at the stage of refining the LC diagnosis. Current approaches to training an IADS do not take into account the clinical and radiological classification or the peculiarities of the clinical forms of LC that are used by the medical community, which creates obstacles to using the available systems. The authors believe that bringing an IADS closer to the "doctor's logic" improves the reproducibility and interpretability of its results. Most IADS described in the literature are based on neural networks, which have several disadvantages that affect reproducibility. This paper proposes a composite algorithm that uses machine learning methods such as Deep Forest and the Siamese neural network, which can be regarded as a more efficient approach when only a small amount of training data is available and as optimal from the reproducibility point of view. The open datasets used for training IADS include annotated objects that in some cases are not confirmed morphologically. The paper describes the LIRA dataset, developed from the diagnostic results of the St. Petersburg Clinical Research Center of Specialized Types of Medical Care (Oncology), which includes only computed tomograms of patients with a verified diagnosis. The paper considers the stages of the machine learning process based on shape features and internal structure features, as well as a newly developed system for the differential diagnosis of LC based on Siamese neural networks. A new approach to feature dimension reduction is also presented, aimed at more efficient and faster training of the system.
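To make the Siamese idea mentioned above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single shared linear layer with a tanh activation as the embedding branch and a standard contrastive loss on the Euclidean distance between embeddings. The names `embed`, `pair_distance`, `contrastive_loss`, the 8-feature input, and the margin value are all hypothetical choices made for illustration.

```python
import numpy as np

def embed(x, weights):
    """Shared embedding branch: the same weights are applied to both inputs."""
    return np.tanh(x @ weights)

def pair_distance(x1, x2, weights):
    """Euclidean distance between the embeddings of two feature vectors."""
    return float(np.linalg.norm(embed(x1, weights) - embed(x2, weights)))

def contrastive_loss(distance, same_class, margin=1.0):
    """Contrastive loss: small distance is rewarded for same-class pairs,
    and distance up to the margin is rewarded for different-class pairs."""
    if same_class:
        return distance ** 2
    return max(0.0, margin - distance) ** 2

# Hypothetical data: two feature vectors of 8 values each (assumed size).
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 4))  # assumed 8 features -> 4-dim embedding
nodule_a = rng.normal(size=8)
nodule_b = rng.normal(size=8)

d_same = pair_distance(nodule_a, nodule_a, weights)  # identical inputs
d_diff = pair_distance(nodule_a, nodule_b, weights)  # different inputs
```

Because both inputs pass through the same weights, the model learns a distance between pairs rather than a per-class decision, which is one reason such networks are often considered for small training sets.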
