Video annotation for content-based retrieval using human behavior analysis and domain knowledge

This paper proposes a method for automatically annotating sports video for content-based retrieval. Conventional methods index video by object position information, such as loci, relative positions, and their transitions. They have two drawbacks: tracking errors for an object, caused by occlusion, lead to recognition failures, and a representation based on position information alone inherently limits the number of events that can be recognized in retrieval. Our approach combines human behavior analysis and domain-specific knowledge with conventional methods to build an integrated reasoning module that offers richer expressiveness of events and more robust recognition. Based on the proposed method, we implemented a content-based retrieval system that can identify several actions in real tennis video. We select court and net lines, players' positions, ball positions, and players' actions as indices. Court and net lines are extracted using a court model and Hough transforms. Player and ball positions are tracked by adaptive template matching, with specific predictions to cope with sudden changes in motion direction. Players' actions are analyzed by 2D appearance-based matching using the transitions of players' silhouettes and a hidden Markov model. Results on two sets of tennis video are presented, demonstrating the performance and validity of our approach.
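To make the action-analysis step more concrete, the following is a minimal sketch of how a hidden Markov model can classify a sequence of quantized silhouette observations. It is not the paper's implementation. The action names, the two-state left-to-right topology, the three-symbol codebook, and all probabilities are hypothetical values chosen for illustration. The sketch scores an observation sequence under each action model with the scaled forward algorithm and picks the model with the highest log-likelihood.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm).

    obs   : list of observation symbol indices (e.g. silhouette codewords)
    start : start[s] = probability of beginning in state s
    trans : trans[p][s] = probability of moving from state p to state s
    emit  : emit[s][k] = probability of emitting symbol k in state s
    """
    if not obs:
        return 0.0  # an empty sequence has probability 1
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states))
                * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow; accumulate the log of the scale factors.
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


def classify_action(obs, models):
    """Return the name of the model that best explains the sequence."""
    return max(
        models, key=lambda name: forward_log_likelihood(obs, *models[name])
    )


# Hypothetical action models: (start, trans, emit). Values are illustrative
# assumptions, not parameters from the paper.
LEFT_TO_RIGHT = [[0.7, 0.3], [0.0, 1.0]]
MODELS = {
    "forehand": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "backhand": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]),
}

if __name__ == "__main__":
    print(classify_action([0, 0, 1, 1], MODELS))
    print(classify_action([2, 2, 1, 1], MODELS))
```

In a real system, the symbol indices would come from matching each frame's player silhouette against a set of appearance templates, and the model parameters would be trained from labeled sequences rather than set by hand.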
