ClassMiner: mining medical video for scalable skimming and summarization

1. SYSTEM TECHNICAL DESCRIPTION

The ClassMiner system is a fully implemented tool for scalable video skimming and summarization. Its key technology is an integrated process for mining medical video content structure and events, which was presented in a paper at the SIGMOD workshop on Data Mining and Knowledge Discovery [1]. As the system architecture in Fig. 1 indicates, we first apply a general video shot segmentation and key-frame selection scheme to parse the video stream into physical units. Video group detection, scene detection, and clustering strategies are then executed to mine the video content structure. Various visual and audio feature processing techniques detect semantic cues within the video, such as slides, faces, and speaker changes, and these detection results are combined to mine three types of events (presentation, dialog, and clinical operation) from the detected video scenes. Finally, a scalable video skimming and summarization tool is built on the mined content structure and event information to help the user visualize and access video content.
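To make the first stage of the pipeline concrete, the following is a minimal sketch of shot segmentation and key-frame selection. It is not the scheme used in ClassMiner, which the text describes only as "general". It assumes each frame has already been reduced to a normalized histogram, declares a cut wherever consecutive histograms differ by more than a fixed threshold, and takes the middle frame of each shot as its key frame. The names `histogram_distance`, `segment_shots`, and `select_key_frames`, the L1 distance, and the threshold value of 0.5 are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# Assumption: each frame is represented by a normalized histogram
# (bin values sum to 1). How the histogram is computed is out of scope here.
Histogram = Sequence[float]


def histogram_distance(a: Histogram, b: Histogram) -> float:
    """L1 distance between two normalized histograms (ranges from 0 to 2)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def segment_shots(
    frames: Sequence[Histogram], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Split frames into shots, returned as (start, end) indices, end inclusive.

    A cut is declared between frames i-1 and i when their histogram distance
    exceeds `threshold`. The default threshold is a hypothetical value; a real
    system would tune or adapt it to the material.
    """
    if not frames:
        return []
    shots: List[Tuple[int, int]] = []
    start = 0
    for i in range(1, len(frames)):
        if histogram_distance(frames[i - 1], frames[i]) > threshold:
            shots.append((start, i - 1))
            start = i
    shots.append((start, len(frames) - 1))
    return shots


def select_key_frames(shots: Sequence[Tuple[int, int]]) -> List[int]:
    """Pick the middle frame of each shot as its key frame (placeholder policy)."""
    return [(start + end) // 2 for start, end in shots]


if __name__ == "__main__":
    # Toy two-bin histograms: two similar frames, an abrupt change, three similar frames.
    toy_frames = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0], [0.0, 1.0]]
    toy_shots = segment_shots(toy_frames)
    print(toy_shots)                      # [(0, 1), (2, 4)]
    print(select_key_frames(toy_shots))   # [0, 3]
```

The resulting shots and key frames are the "physical units" on which later stages (group detection, scene detection, clustering, and event mining) would operate. A fixed global threshold is the simplest choice and tends to miss gradual transitions, so it should be read as a stand-in for whatever segmentation scheme the system actually applies.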