Machine and deep learning for workflow recognition during surgery

Abstract: Recent years have seen tremendous progress in artificial intelligence (AI), for example in the automatic, real-time recognition of objects and activities in videos in the field of computer vision. Because it is increasingly digitalized, the operating room (OR) stands to benefit directly from this progress through new assistance tools that can enhance the abilities and performance of surgical teams. Key to such tools is recognition of the surgical workflow: to assist efficiently, an AI system must be aware of the surgical context, namely of all activities taking place inside the operating room. We present how several recent machine learning and deep learning techniques can be used to analyze the activities taking place during surgery, using videos captured by either endoscopic or ceiling-mounted cameras. We also present two potential clinical applications that we are developing at the University of Strasbourg with our clinical partners.
