Paper information - Video retrieval and summarization

Video retrieval and summarization

This year, it is anticipated that 25% of the population of the wealthy countries will have a digital television camera at their disposal. The combined capacity of these devices to generate bits is astronomical. In addition, the growth in computer speed and disc capacity and, most of all, the rapid growth of the Internet and the WWW will make this information accessible worldwide. The immediate question is what to do with all the information. One could store the digital video on tapes, CD-ROMs, DVDs, or any similar device, but the level of access would be lower than that of the well-known shoe boxes filled with tapes, old photographs, and letters. We need to ensure that the techniques for organizing video keep pace with the tremendous amounts of information. So, with video on demand about to arrive, there is an urgent need for effective video retrieval and summarization methods.

Creating access to still images has proved to be a hard problem. Analyzing the contents of a photograph requires hard work, precise modeling, the inclusion of considerable amounts of a priori knowledge, and solid experimentation. Even though video tends to be much larger than images, it can be argued that access to video is a simpler problem than access to still images. First, video comes in color, and color provides easy clues to object geometry, the position of the light, and the identification of objects by pixel patterns, at the sole expense of having to handle three times more data than black and white. Second, video comes as a sequence, and what moves together most likely forms an entity in real life, so segmentation of video is intrinsically simpler than segmentation of a still image, again at the expense only of having more data to handle.

That does not mean progress will come for free. Moving from images to video adds several orders of complexity to the retrieval problem, because indexing, analysis, and browsing must all deal with the inherently temporal aspect of video.
For example, the user can pose a similarity-based query such as "Find a video scene similar to this one." Responding to such a query requires representations of both the image content and the temporal aspects of the video scene. Furthermore, higher-level representations that reflect the structure of the constituent video shots, or semantic temporal information such as gestures, could also aid in retrieving the right video scene. A consequence of the growing consumer demand for visual information is that sophisticated technology is needed for representing, modeling, indexing, and retrieving multimedia data. In particular, we need robust techniques to index, retrieve, and compress visual information, new scalable browsing algorithms allowing access to

Nicu Sebe | Arnold W. M. Smeulders | Michael S. Lew
