A deep learning- and partial least square regression-based model observer for a low-contrast lesion detection task in CT.

PURPOSE This work aims to develop a new framework for image quality assessment using a deep learning-based model observer (DL-MO) and to validate it in a low-contrast lesion detection task involving CT images with patient anatomical background. METHODS The DL-MO was developed using a transfer learning strategy to combine a pretrained deep convolutional neural network (CNN), a partial least squares regression discriminant analysis (PLS-DA) model, and an internal noise component. The CNN had previously been trained to achieve state-of-the-art classification accuracy on a natural image database. The earlier layers of the CNN were used as a deep feature extractor, on the assumption that similarity exists between the CNN and the human visual system. The PLS-DA model was used to further engineer the deep features for the lesion detection task in CT images. The internal noise component was incorporated to model the inefficiency and variability of human observer (HO) performance and to generate the final DL-MO test statistics. Seven abdominal CT exams were retrospectively collected from the same type of CT scanner. To compare DL-MO with HO, 12 experimental conditions with varying lesion size, lesion contrast, radiation dose, and reconstruction type were generated, each with 154 trials. CT images of a real liver metastatic lesion were numerically modified to generate lesion models with four lesion sizes (5, 7, 9, and 11 mm) and three contrast levels (15, 20, and 25 HU). The lesions were inserted into patient liver images using a projection-based method. A validated noise insertion tool was used to synthesize CT exams at 50% and 25% of the routine radiation dose level. CT images were reconstructed using the weighted filtered back projection algorithm and an iterative reconstruction algorithm.
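As a rough illustration of how an internal noise component can turn model test statistics into a human-like score, the sketch below perturbs each test statistic with zero-mean Gaussian noise and scores a two-alternative forced choice trial as correct when the lesion-present image has the larger noisy value. The additive Gaussian form, the function name `two_afc_percent_correct`, the parameter `noise_sd`, and the simulated inputs are all assumptions made for illustration; the abstract does not specify how the internal noise was actually modeled.

```python
import numpy as np

def two_afc_percent_correct(t_signal, t_background, noise_sd=0.0, rng=None):
    """Fraction of 2AFC trials in which the lesion-present image wins.

    t_signal, t_background: per-trial test statistics (same length).
    noise_sd: standard deviation of the assumed additive internal noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    t_signal = np.asarray(t_signal, dtype=float)
    t_background = np.asarray(t_background, dtype=float)
    # Assumed internal noise model: independent zero-mean Gaussian noise
    # added to each test statistic before the decision is made.
    noisy_signal = t_signal + rng.normal(0.0, noise_sd, t_signal.shape)
    noisy_background = t_background + rng.normal(0.0, noise_sd, t_background.shape)
    # A trial counts as correct when the lesion-present statistic is larger.
    return float(np.mean(noisy_signal > noisy_background))

# Simulated test statistics for one condition with 154 trials
# (synthetic values, not data from the study).
rng = np.random.default_rng(0)
t_sig = rng.normal(1.0, 1.0, 154)
t_bkg = rng.normal(0.0, 1.0, 154)
pc_without_noise = two_afc_percent_correct(t_sig, t_bkg, noise_sd=0.0, rng=rng)
pc_with_noise = two_afc_percent_correct(t_sig, t_bkg, noise_sd=1.0, rng=rng)
print(pc_without_noise, pc_with_noise)
```

In this kind of scheme, `noise_sd` would typically be tuned so that the model's percent correct matches human performance on a calibration subset of the conditions.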
Four medical physicists performed a two-alternative forced choice (2AFC) detection task (with multislice scrolling viewing mode) on patient images across the 12 experimental conditions. The DL-MO was applied to the same datasets. Statistical analyses were performed to evaluate the correlation and agreement between DL-MO and HO. RESULTS A statistically significant positive correlation was observed between DL-MO and HO for the 2AFC low-contrast detection task involving patient liver background. The corresponding Pearson product-moment correlation coefficient was 0.986 [95% confidence interval (0.950, 0.996)]. Bland-Altman agreement analysis did not indicate statistically significant differences between DL-MO and HO performance. CONCLUSIONS The proposed DL-MO is highly correlated with HO in a low-contrast detection task involving realistic patient liver background. This study demonstrated the potential of the proposed DL-MO to assess image quality directly from patient images in realistic, clinically relevant CT tasks.
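The correlation and agreement analyses named above can be sketched as follows. This is a minimal example assuming a Fisher z-transform confidence interval for the Pearson coefficient and the conventional Bland-Altman limits of agreement (bias ± 1.96 standard deviations of the differences); the abstract does not state which interval method was used, and the per-condition values below are made-up placeholders, not results from the study.

```python
import numpy as np
from statistics import NormalDist

def pearson_with_ci(x, y, level=0.95):
    """Pearson r with a Fisher z-transform confidence interval."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = float(np.corrcoef(x, y)[0, 1])
    z = np.arctanh(r)                    # Fisher z-transform of r
    se = 1.0 / np.sqrt(len(x) - 3)       # approximate standard error of z
    q = NormalDist().inv_cdf(0.5 + level / 2.0)
    return r, float(np.tanh(z - q * se)), float(np.tanh(z + q * se))

def bland_altman(x, y):
    """Mean difference (bias) and 95% limits of agreement."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    bias = float(d.mean())
    sd = float(d.std(ddof=1))            # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical percent-correct values for 12 conditions (placeholders only).
ho = np.array([62, 68, 71, 75, 79, 82, 85, 88, 90, 93, 95, 97], dtype=float)
mo = np.array([63, 66, 72, 76, 78, 83, 84, 89, 91, 92, 96, 97], dtype=float)

r, ci_low, ci_high = pearson_with_ci(mo, ho)
bias, loa_low, loa_high = bland_altman(mo, ho)
print(r, ci_low, ci_high)
print(bias, loa_low, loa_high)
```

With 12 conditions, the standard error of the transformed coefficient is 1/3, which is why the interval stays fairly wide even when the correlation is very high.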
