Two-stage U-Net++ for Medical Image Segmentation

Convolutional neural networks (CNNs) have achieved expert-level performance in many image processing applications. However, CNNs suffer from the vanishing gradient problem when the number of layers is increased beyond a certain threshold. In this paper, a new two-stage U-Net++ (TS-UNet++) architecture is proposed to address this problem. Unlike a traditional multi-stage network, the new architecture uses two different types of deep CNNs: the U-Net++ architecture in the first stage and the U-Net architecture in the second. An extra convolutional block is added before the output layer of the multi-stage network to better extract high-level features. A new concatenation-based fusion structure is incorporated into the architecture to enable deep supervision, and additional convolutional layers are added after each concatenation in the fusion structure to extract more representative features. The proposed method is compared with the U-Net, U-Net++ and two-stage U-Net (TS-UNet) architectures on the problem of segmenting neck muscles in a clinical MRI dataset. The architectures were evaluated using the Dice similarity coefficient (DSC) and the directed Hausdorff distance (DHD), and the results demonstrate the superior performance of the new architecture.
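
For readers unfamiliar with the two evaluation measures, the following is a minimal sketch of how DSC and DHD can be computed for a pair of 2-D binary masks. It is illustrative only and is not the evaluation code used in this work: the function names, the toy masks, the choice to measure distances in pixels over all foreground pixels (rather than boundary pixels or physical units), and the convention for two empty masks are all assumptions made for this example.

```python
import math


def foreground_points(mask):
    """Return the (row, col) coordinates of the non-zero pixels in a 2-D mask."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def dice_similarity(pred, truth):
    """DSC = 2|A & B| / (|A| + |B|) for two binary masks of equal shape."""
    a = set(foreground_points(pred))
    b = set(foreground_points(truth))
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def directed_hausdorff(pred, truth):
    """DHD from pred to truth: the largest distance from any pred pixel
    to its nearest truth pixel (Euclidean, in pixel units)."""
    a = foreground_points(pred)
    b = foreground_points(truth)
    if not a or not b:
        raise ValueError("both masks need at least one foreground pixel")
    return max(min(math.dist(p, q) for q in b) for p in a)


# Hypothetical toy masks: the prediction has one extra pixel at (1, 3).
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

print(dice_similarity(pred, truth))     # 8/9: 4 shared pixels, 5 + 4 in total
print(directed_hausdorff(pred, truth))  # 1.0: the extra pixel is 1 px from truth
print(directed_hausdorff(truth, pred))  # 0.0: every truth pixel is also in pred
```

The last two lines show why the measure is called directed: it is not symmetric, so the direction in which it is evaluated (prediction to ground truth, or the reverse) has to be stated. A higher DSC and a lower DHD both indicate closer agreement with the reference segmentation.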