i3PosNet: instrument pose estimation from X-ray in temporal bone surgery

Purpose: Accurate estimation of the position and orientation (pose) of surgical instruments is crucial for delicate minimally invasive temporal bone surgery. Current techniques lack accuracy and/or suffer from line-of-sight constraints (conventional tracking systems), or they expose the patient to prohibitive ionizing radiation (intra-operative CT). A possible solution is to capture the instrument with a C-arm at irregular intervals and recover the pose from the image.

Methods: i3PosNet infers the position and orientation of instruments from images using a pose estimation network. The framework considers localized patches and outputs pseudo-landmarks. The pose is then reconstructed from the pseudo-landmarks by geometric considerations.

Results: We show that i3PosNet reaches errors < 0.05 mm. It outperforms conventional image registration-based approaches, reducing average and maximum errors by at least two thirds. i3PosNet trained on synthetic images generalizes to real X-rays without any further adaptation.

Conclusion: The translation of deep learning-based methods to surgical applications is difficult because large representative datasets for training and testing are not available. This work empirically shows sub-millimeter pose estimation with a network trained solely on synthetic data.
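
The geometric reconstruction step mentioned under Methods can be illustrated with a minimal sketch. The abstract does not specify the landmark layout, so the code below assumes a hypothetical one: the first pseudo-landmark marks the instrument tip and the remaining ones lie along the instrument axis. The function name `pose_from_landmarks` and the line-fitting approach are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def pose_from_landmarks(landmarks):
    """Recover a 2D position and in-plane angle from pseudo-landmarks.

    Assumed (hypothetical) layout: row 0 is the instrument tip, the
    remaining rows lie along the instrument axis behind the tip.
    """
    pts = np.asarray(landmarks, dtype=float)
    position = pts[0]

    # Fit the instrument axis as the principal direction of the points.
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    direction = vt[0]

    # The sign of a principal direction is arbitrary; make it point
    # from the landmark centroid toward the tip.
    if np.dot(direction, position - centroid) < 0:
        direction = -direction

    angle_deg = np.degrees(np.arctan2(direction[1], direction[0]))
    return position, angle_deg


if __name__ == "__main__":
    # Three collinear landmarks on a diagonal, tip first.
    pos, angle = pose_from_landmarks([[10.0, 10.0], [9.0, 9.0], [8.0, 8.0]])
    print(pos, angle)
```

Fitting a line through all landmarks, rather than using only two of them, averages out per-landmark prediction noise in the estimated angle; this is a design choice of the sketch, not a claim about i3PosNet.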
