TAToo: Vision-based Joint Tracking of Anatomy and Tool for Skull-base Surgery

Purpose: Tracking the 3D motion of the surgical tool and the patient anatomy is a fundamental requirement for computer-assisted skull-base surgery. The estimated motion can be used both for intra-operative guidance and for downstream skill analysis. Recovering such motion solely from surgical videos is desirable, as it is compliant with current clinical workflows and instrumentation.

Methods: We present Tracker of Anatomy and Tool (TAToo), which jointly tracks the rigid 3D motion of the patient skull and the surgical drill from stereo microscopic videos. TAToo estimates motion via an iterative optimization process in an end-to-end differentiable form. For robust tracking performance, TAToo adopts a probabilistic formulation and enforces geometric constraints at the object level.

Results: We validate TAToo on simulation data, where ground-truth motion is available, and on anthropomorphic phantom data, where optical tracking provides a strong baseline. We report sub-millimeter and millimeter inter-frame tracking accuracy for skull and drill, respectively, with rotation errors below 1°. We further illustrate how TAToo may be used in a surgical navigation setting.

Conclusion: We present TAToo, which simultaneously tracks the surgical tool and the patient anatomy in skull-base surgery. TAToo predicts the motion directly from surgical videos, without the need for any markers. Our results show that the performance of TAToo compares favorably to competing approaches. Future work will include fine-tuning our depth network to reach the 1 mm clinical accuracy goal desired for surgical applications in the skull base.
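The inter-frame accuracy figures quoted above (translation in millimeters, rotation in degrees) can be made concrete with a small sketch. The abstract does not spell out how the errors are defined, so the following is an assumption rather than the authors' evaluation code: it uses the common convention of Euclidean distance between translation vectors and the geodesic angle of the relative rotation. All function names (`rotation_error_deg`, `translation_error_mm`, `rot_z`) are hypothetical and introduced only for illustration.

```python
import math

def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose3(a):
    """Transpose a 3x3 matrix (the inverse, for a rotation matrix)."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def rotation_error_deg(r_est, r_gt):
    """Geodesic angle, in degrees, between estimated and reference rotations.

    Assumed metric: the angle of R_gt^T @ R_est, recovered from its trace.
    """
    r_rel = matmul3(transpose3(r_gt), r_est)
    trace = r_rel[0][0] + r_rel[1][1] + r_rel[2][2]
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def translation_error_mm(t_est, t_gt):
    """Euclidean distance between two translations, assumed to be in mm."""
    return math.dist(t_est, t_gt)

def rot_z(deg):
    """Helper for the example: rotation about the z-axis by `deg` degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Example: an estimate that is off by 0.5 degrees and 0.5 mm.
rot_err = rotation_error_deg(rot_z(10.5), rot_z(10.0))
trans_err = translation_error_mm([1.3, 2.4, 3.0], [1.0, 2.0, 3.0])
print(f"rotation error: {rot_err:.3f} deg, translation error: {trans_err:.3f} mm")
```

Under this assumed definition, an estimate would meet the reported accuracy level when the rotation error stays below 1° and the translation error stays near or below 1 mm.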
