Shot boundary detection in endoscopic surgery videos using a variational Bayesian framework

Purpose: Over the last decade, the demand for content management of video recordings of surgical procedures has greatly increased. Although a few methods have been published in this direction, the related literature is still in its infancy. In this paper, we address the problem of shot detection in endoscopic surgery videos, a fundamental step in content-based video analysis.

Methods: The video is first decomposed into short clips that are processed sequentially. After feature extraction, we fit a spatiotemporal Gaussian mixture model (GMM) to each clip and apply a variational Bayesian (VB) algorithm to approximate the posterior distribution of the model parameters. The VBGMM algorithm determines the number of components automatically. The estimated components are matched along the video sequence via their Kullback–Leibler divergence. A shot border is declared when component tracking fails, which signifies a change in the visual appearance of the surgical scene.

Results: Experimental evaluation was performed on laparoscopic videos containing a variable number of shots. Performance was measured via precision, recall, coverage and overflow metrics. The proposed method was compared with GMM and with a shot detection method based on spatiotemporal motion differences (MotionDiff). The results demonstrate that VBGMM outperforms the other methods on most assessment metrics, with precision and recall above 80 % and coverage of 84 %. Overflow, however, was worse for VBGMM than for MotionDiff (37 vs. 27 %).

Conclusions: The proposed method generated promising results for shot border detection. Spatiotemporal modeling via VBGMMs provides a means to explore additional applications such as component tracking.
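The component-matching step can be illustrated with a minimal sketch. It uses the closed-form Kullback–Leibler divergence between two multivariate Gaussians and assumes each clip is summarized as a list of (mean, covariance) pairs. The names `gaussian_kl`, `match_components` and `is_shot_border`, the `max_kl` threshold, and the rule that a border is declared only when every component fails to match are hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL(N0 || N1) between two multivariate Gaussians."""
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    # slogdet is numerically safer than log(det(...))
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - k
                  + logdet1 - logdet0)

def match_components(prev, curr, max_kl):
    """Match each component of the previous clip to its nearest component
    (smallest KL divergence) in the current clip.

    prev, curr: lists of (mean, covariance) tuples.
    max_kl: hypothetical threshold above which a match is rejected.
    Returns a list of (prev_index, curr_index or None).
    """
    matches = []
    for i, (mu_p, cov_p) in enumerate(prev):
        if not curr:
            matches.append((i, None))
            continue
        kls = [gaussian_kl(mu_p, cov_p, mu_c, cov_c) for mu_c, cov_c in curr]
        j = int(np.argmin(kls))
        matches.append((i, j if kls[j] <= max_kl else None))
    return matches

def is_shot_border(matches):
    """Assumed rule: tracking has failed when no component could be matched."""
    return all(j is None for _, j in matches)

# Toy example with 2-D components (real features would be spatiotemporal).
identity = np.eye(2)
prev_clip = [(np.zeros(2), identity)]
similar_clip = [(np.full(2, 0.1), identity), (np.full(2, 5.0), identity)]
different_clip = [(np.full(2, 10.0), identity)]

same_shot = match_components(prev_clip, similar_clip, max_kl=1.0)
new_shot = match_components(prev_clip, different_clip, max_kl=1.0)
print(is_shot_border(same_shot), is_shot_border(new_shot))
```

Because the KL divergence is asymmetric, a practical system might prefer a symmetrized variant; this sketch uses the one-directional form for brevity.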
