Accessing Video Contents: Cooperative Approach between Image and Natural Language Processing

Digital video libraries are becoming increasingly important. Building them requires methods for accessing and extracting the semantic contents of videos. This paper demonstrates the benefits of multi-modal video analysis, which combines image and natural language processing, for extracting those contents. Two systems, Name-It and Spot-It, are introduced as examples of this approach. Name-It detects faces in news videos and associates them with their names. Spot-It classifies video segments into several meaningful categories. Their results can improve both retrieval and presentation in digital video libraries. The successful results demonstrate the importance of our approach.
