Deep learning for patient‐specific quality assurance: Identifying errors in radiotherapy delivery by radiomic analysis of gamma images with convolutional neural networks

PURPOSE Patient-specific quality assurance (QA) for intensity-modulated radiation therapy (IMRT) is a ubiquitous clinical procedure, but conventional methods have often been criticized as insensitive to errors or less effective than other common physics checks. Recently, there has been interest in applying radiomics, the quantitative extraction of image features, to radiotherapy QA. In this work, we investigate a deep learning approach to classify the presence or absence of introduced radiotherapy treatment delivery errors from patient-specific QA.
METHODS Planar dose maps for 186 IMRT beams from 23 IMRT plans were evaluated. Each plan was transferred to a cylindrical phantom CT geometry. Three sets of planar doses were exported from each plan, corresponding to (a) the error-free case, (b) a random multileaf collimator (MLC) error case, and (c) a systematic MLC error case. Each plan was delivered to the electronic portal imaging device (EPID), and planned and measured doses were used to calculate gamma images in an EPID dosimetry software package (186 beams in three cases each, for a total of 558 gamma images). Two radiomic approaches were used. In the first, a convolutional neural network with triplet learning was used to extract image features from the gamma images. In the second, a handcrafted approach using texture features was used. The resulting metrics from both approaches were input into four machine learning classifiers (support vector machines, multilayer perceptrons, decision trees, and k-nearest-neighbors) to determine whether images contained the introduced errors. Two experiments were considered: the two-class experiment classified images as error-free or containing any MLC error, and the three-class experiment classified images as error-free, containing a random MLC error, or containing a systematic MLC error. Additionally, threshold-based passing criteria were calculated for comparison.
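
The triplet learning mentioned in the methods can be illustrated with a minimal sketch. The abstract does not state which triplet loss was used, so the margin-based form below is an assumption, as are the function names, the margin value, and the toy two-dimensional embeddings. The idea is that an anchor image's features are pulled toward those of an image from the same class (positive) and pushed away from those of an image from a different class (negative).

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss (an assumed formulation, not taken from the paper).

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 2-D embeddings standing in for CNN features of gamma images.
anchor = [0.0, 0.0]      # e.g. an error-free image
same_class = [0.0, 1.0]  # another error-free image, distance 1 from the anchor
other_class = [3.0, 4.0] # an image with an MLC error, distance 5 from the anchor

# Well-separated triplet: 1 - 5 + 1 is negative, so the loss is clipped to 0.
loss_easy = triplet_margin_loss(anchor, same_class, other_class)

# Swapping the roles gives a badly ordered triplet: 5 - 1 + 1 = 5.
loss_hard = triplet_margin_loss(anchor, other_class, same_class)
```

In a trained network the embeddings would come from the convolutional layers, and a classifier such as a support vector machine would then operate on those embeddings.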
RESULTS In total, 303 gamma images were used for model training and 255 images were used for model testing. The highest classification accuracy was achieved with the deep learning approach, with a maximum accuracy of 77.3% in the two-class experiment and 64.3% in the three-class experiment. The performance of the handcrafted approach with texture features was lower, with a maximum accuracy of 66.3% in the two-class experiment and 53.7% in the three-class experiment. Variability between the results of the four machine learning classifiers was lower for the deep learning approach than for the texture feature approach. Both radiomic approaches were superior to threshold-based passing criteria.
CONCLUSIONS Deep learning with convolutional neural networks can be used to classify the presence or absence of introduced radiotherapy treatment delivery errors from patient-specific gamma images. The performance of the deep learning network was superior to a handcrafted approach with texture features, and both radiomic approaches were better than threshold-based passing criteria. The results suggest that radiomic QA is a promising direction for clinical radiotherapy.
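
For context on the threshold-based baseline, the sketch below shows one way a gamma image and its passing rate might be computed from a planned and a measured planar dose. It is a brute-force illustration, not the EPID dosimetry software used in the study. The 3%/3 mm criteria, global dose normalization, pixel spacing, absence of interpolation or a low-dose cutoff, and the 90% passing threshold are all assumptions, since the abstract does not specify them.

```python
import math

def gamma_image(planned, measured, spacing_mm=1.0, dta_mm=3.0, dose_frac=0.03):
    """Brute-force gamma index on a 2-D grid (illustrative, assumed parameters).

    For each measured pixel, take the minimum over nearby planned pixels of
    sqrt(distance^2 / dta^2 + dose_difference^2 / dose_tolerance^2).
    """
    rows, cols = len(measured), len(measured[0])
    # Global normalization: tolerance is a fraction of the maximum planned dose.
    dose_tol = dose_frac * max(max(row) for row in planned)
    # Searching out to the DTA is enough to decide pass/fail at gamma <= 1;
    # gamma values above 1 may be overestimated by this limited window.
    reach = int(math.ceil(dta_mm / spacing_mm))
    gamma = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            best = math.inf
            for di in range(-reach, reach + 1):
                for dj in range(-reach, reach + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < rows and 0 <= x < cols:
                        dist2 = (di * di + dj * dj) * spacing_mm ** 2
                        diff = planned[y][x] - measured[i][j]
                        term = dist2 / dta_mm ** 2 + diff * diff / dose_tol ** 2
                        best = min(best, term)
            gamma[i][j] = math.sqrt(best)
    return gamma

def passing_rate(gamma, limit=1.0):
    """Percentage of pixels whose gamma value does not exceed `limit`."""
    values = [g for row in gamma for g in row]
    return 100.0 * sum(g <= limit for g in values) / len(values)

# Toy data: a uniform 5 x 5 planned dose and a measurement with one 10% hot pixel.
planned = [[100.0] * 5 for _ in range(5)]
measured = [row[:] for row in planned]
measured[2][2] = 110.0

gamma = gamma_image(planned, measured)
rate = passing_rate(gamma)   # 24 of 25 pixels pass
plan_passes = rate >= 90.0   # hypothetical threshold-based passing criterion
```

The toy case hints at why a single passing-rate threshold can miss localized errors: one clearly failing pixel still leaves the overall rate well above the assumed threshold, whereas a classifier working on the whole gamma image can use its spatial pattern.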
