You Only Learn Once: Universal Anatomical Landmark Detection

Detecting anatomical landmarks in medical images plays an essential role in understanding the anatomy and planning automated processing. In recent years, a variety of deep neural network methods have been developed to detect landmarks automatically. However, all of those methods are unary in the sense that a highly specialized network is trained for a single task, say, one associated with a particular anatomical region. In this work, for the first time, we investigate the idea of "You Only Learn Once (YOLO)" and develop a universal anatomical landmark detection model that handles multiple landmark detection tasks with end-to-end training on mixed datasets. The model consists of a local network and a global network: the local network is built upon the idea of the universal U-Net to learn multi-domain local features, and the global network is a parallel-duplicated sequence of dilated convolutions that extracts global features to further disambiguate the landmark locations. Notably, the new model design requires fewer trainable parameters than models built with standard convolutions. We evaluate our YOLO model on three X-ray datasets totaling 1,588 images of the head, hand, and chest, which collectively contribute 62 landmarks. The experimental results show that our proposed universal model performs substantially better than any previous model trained on multiple datasets. It even outperforms models trained separately on each single dataset.
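To illustrate why a sequence of dilated convolutions is a natural choice for extracting global features, the following minimal sketch compares the receptive field of stacked 3x3 convolutions with and without dilation. The kernel size, the dilation rates (1, 2, 4, 8), and the channel width are illustrative assumptions, not values stated in the abstract, and the helper names are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (pixels per axis) of stacked stride-1 convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def conv_weight_count(kernel_size, in_ch, out_ch):
    """Weights in one 2D convolution layer (bias ignored).

    Dilation spreads the kernel taps apart but adds no weights.
    """
    return kernel_size * kernel_size * in_ch * out_ch


# Assumed configuration: four 3x3 layers, 64 channels in and out.
plain = receptive_field(3, [1, 1, 1, 1])      # no dilation
dilated = receptive_field(3, [1, 2, 4, 8])    # assumed dilation schedule
weights_per_layer = conv_weight_count(3, 64, 64)

print(plain, dilated, weights_per_layer)
```

Under these assumptions, the dilated stack covers a 31-pixel extent per axis versus 9 pixels for the plain stack, with the same number of weights per layer. This only shows how dilation enlarges context at no parameter cost; it does not by itself account for the parameter savings the abstract reports.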
