Medical image segmentation automatic quality control: A multi-dimensional approach

In clinical applications, using erroneous segmentations of medical images can have serious consequences. Current approaches to automatic quality control (QC) of medical image segmentation do not predict segmentation quality at the slice (2D) level, which leads to sub-optimal evaluations. Our 2D-based deep learning method performs QC simultaneously at the 2D and 3D levels for cardiovascular MR image segmentations. We compared it with 3D approaches by training both on 36,540 (2D) / 3,842 (3D) samples to predict Dice Similarity Coefficients (DSC) for four structures of the left ventricle: trabeculations (LVT), myocardium (LVM), papillary muscles (LVPM) and blood (LVC). The 2D-based method outperformed the 3D method. At the 2D level, the mean absolute errors (MAEs) of the DSC predictions for 3,823 samples were 0.02, 0.02, 0.05 and 0.02 for LVM, LVC, LVT and LVPM, respectively. At the 3D level, for 402 samples, the corresponding MAEs were 0.02, 0.01, 0.02 and 0.04. The method was validated in a clinical practice evaluation against semi-qualitative scores provided by expert cardiologists for 1,016 subjects of the UK Biobank. Finally, we provided evidence that multi-level QC could be used to enhance clinical measurements derived from image segmentations.
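For readers unfamiliar with the two metrics above, the following is a minimal sketch of how a DSC can be computed at slice (2D) and volume (3D) level, and how an MAE between predicted and true DSC values is obtained. It uses the standard definition DSC = 2|A ∩ B| / (|A| + |B|). The function names, the flat-list mask layout, and the convention for two empty masks are assumptions made for illustration; this is not the QC model described here, which predicts the DSC without access to a reference segmentation.

```python
from statistics import mean


def dice(pred, ref):
    """Dice Similarity Coefficient of two binary masks (flat sequences of 0/1)."""
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def slice_and_volume_dice(pred_vol, ref_vol):
    """Return (per-slice DSC list, whole-volume DSC).

    Each volume is a list of slices; each slice is a flat list of 0/1.
    """
    per_slice = [dice(p, r) for p, r in zip(pred_vol, ref_vol)]
    flat_pred = [v for s in pred_vol for v in s]
    flat_ref = [v for s in ref_vol for v in s]
    return per_slice, dice(flat_pred, flat_ref)


def mean_absolute_error(predicted, true):
    """MAE between predicted and true DSC values."""
    return mean(abs(p - t) for p, t in zip(predicted, true))


# Toy two-slice volume (hypothetical data, 4 pixels per slice).
pred_vol = [[1, 1, 0, 0], [1, 0, 0, 0]]
ref_vol = [[1, 1, 1, 0], [0, 0, 0, 0]]

true_2d, true_3d = slice_and_volume_dice(pred_vol, ref_vol)

# Hypothetical DSC values that a QC model might output for the two slices.
predicted_2d = [0.75, 0.1]
mae_2d = mean_absolute_error(predicted_2d, true_2d)
```

In this toy case the second slice has a DSC of 0 while the volume-level DSC remains well above 0, which illustrates why a volume-level score alone can hide a poorly segmented slice.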