Scalable gastroscopic video summarization via similar-inhibition dictionary selection

OBJECTIVE This paper aims to develop an automated gastroscopic video summarization algorithm that helps clinicians review the abnormal contents of a video more efficiently.

METHODS AND MATERIALS To select the most representative frames from the original video sequence, we formulate gastroscopic video summarization as a dictionary selection problem. Unlike traditional dictionary selection methods, which consider only the number and reconstruction ability of the selected key frames, our model introduces a similar-inhibition constraint to reinforce the diversity of the selected key frames. We calculate an attention cost by merging gaze and content change into a prior cue that helps select frames with richer high-level semantic information. Moreover, we adopt an image quality evaluation process to eliminate interference from poor-quality images, and a segmentation process to reduce the computational complexity.

RESULTS For the experiments, we build a new gastroscopic video dataset captured from 30 volunteers, containing more than 400k images, and compare our method with state-of-the-art methods in terms of content consistency, index consistency and content-index consistency with the ground truth. Compared with all competitors, our method obtains the best results in 23 of 30 videos when evaluated on content consistency, 24 of 30 videos when evaluated on index consistency, and all videos when evaluated on content-index consistency.

CONCLUSIONS For gastroscopic video summarization, we propose an automated annotation method via similar-inhibition dictionary selection. Our model achieves better performance than other state-of-the-art models and supplies more suitable key frames for diagnosis. The developed algorithm can be automatically adapted to various real-world applications, such as the training of young clinicians, computer-aided diagnosis or medical report generation.
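The idea of combining reconstruction ability with a similar-inhibition term can be illustrated with a small greedy sketch. This is not the optimization used in the paper, whose exact objective and solver are not given in the abstract. It is a simplified stand-in that assumes each frame is already described by a feature vector. The function name `select_key_frames`, the `inhibition` weight and the cosine-similarity penalty are all hypothetical choices made for illustration.

```python
import numpy as np


def select_key_frames(features, num_keys, inhibition=0.5):
    """Greedy illustration only (hypothetical helper, not the paper's solver).

    Each step picks the frame that explains the most remaining (residual)
    energy of all frames, minus a penalty for being similar to frames that
    were already selected.
    """
    X = np.asarray(features, dtype=float)            # shape (n_frames, n_dims)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)                # unit-length frame features
    sim = np.abs(Xn @ Xn.T)                          # pairwise cosine similarity
    residual = X.copy()                              # part not yet reconstructed
    max_sim = np.zeros(len(X))                       # similarity to chosen frames
    selected = []

    for _ in range(min(num_keys, len(X))):
        # Reconstruction gain: residual energy captured along each candidate.
        gain = np.sum((residual @ Xn.T) ** 2, axis=0)
        gain = gain / max(gain.max(), 1e-12)         # rescale to [0, 1]
        # Similar-inhibition: penalise candidates close to selected frames.
        score = gain - inhibition * max_sim
        if selected:
            score[selected] = -np.inf                # never pick a frame twice
        j = int(np.argmax(score))
        selected.append(j)
        max_sim = np.maximum(max_sim, sim[j])

        # Remove the newly explained direction from every frame's residual.
        q = residual[j]
        q_norm = np.linalg.norm(q)
        if q_norm > 1e-12:
            q = q / q_norm
            residual = residual - np.outer(residual @ q, q)
    return selected


# Toy usage: three near-duplicate frames and two frames of a different kind.
frames = [[1.0, 0.0], [0.99, 0.05], [0.98, 0.1], [0.0, 1.0], [0.05, 0.99]]
keys = select_key_frames(frames, num_keys=2)
```

In this toy input, the two selected indices come from different groups of near-duplicate frames, which is the diversity effect the similar-inhibition constraint is meant to encourage. A faithful implementation would also need the attention cost, the image quality filter and the segmentation step described in the abstract.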
