A computational approach to semantic event detection

We propose a three-level video event detection algorithm and apply it to the detection of animal hunts in wildlife documentaries. The first level extracts texture, color, and motion features and detects motion blobs. The second level employs a neural network to verify whether the motion blobs belong to objects of interest. It also generates shot summaries in terms of intermediate-level descriptors, which combine the low-level features from the first level with the results of mid-level, domain-specific inferences made on the basis of shot features. At the third level, a domain-specific inference process uses the shot summaries to detect the video segments that contain events of interest, e.g., hunts. Applications of the proposed approach include event-based video indexing, summarization, and browsing.
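The flow of data between the three levels can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, fields, `speed_threshold`, `min_shots`, and the run-of-consecutive-shots rule are all hypothetical, and the neural network of the second level is replaced by an arbitrary callable supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class MotionBlob:
    """Level-1 output: a moving region with low-level features (hypothetical fields)."""
    texture: float
    color: float
    speed: float


@dataclass
class ShotSummary:
    """Level-2 output: intermediate descriptors for one shot (hypothetical fields)."""
    shot_index: int
    has_object_of_interest: bool
    fast_motion: bool


def summarize_shot(
    shot_index: int,
    blobs: List[MotionBlob],
    is_object_of_interest: Callable[[MotionBlob], bool],
    speed_threshold: float = 1.0,  # assumed value, not from the paper
) -> ShotSummary:
    """Level 2: verify blobs with a classifier (stand-in for the neural
    network) and condense the shot into intermediate-level descriptors."""
    verified = [b for b in blobs if is_object_of_interest(b)]
    return ShotSummary(
        shot_index=shot_index,
        has_object_of_interest=bool(verified),
        fast_motion=any(b.speed > speed_threshold for b in verified),
    )


def detect_events(
    summaries: List[ShotSummary], min_shots: int = 2
) -> List[Tuple[int, int]]:
    """Level 3: a toy domain rule. Report (first, last) shot indices of every
    run of at least `min_shots` consecutive shots that contain a fast-moving
    object of interest."""
    events: List[Tuple[int, int]] = []
    run: List[int] = []
    for s in summaries:
        if s.has_object_of_interest and s.fast_motion:
            run.append(s.shot_index)
        else:
            if len(run) >= min_shots:
                events.append((run[0], run[-1]))
            run = []
    if len(run) >= min_shots:  # flush a run that reaches the last shot
        events.append((run[0], run[-1]))
    return events


# Usage: four shots, each with the blobs a level-1 stage might have produced.
shots = [
    [MotionBlob(texture=0.8, color=0.6, speed=0.2)],  # object of interest, slow
    [MotionBlob(texture=0.8, color=0.6, speed=2.0)],  # object of interest, fast
    [MotionBlob(texture=0.9, color=0.5, speed=3.0)],  # object of interest, fast
    [MotionBlob(texture=0.1, color=0.1, speed=3.0)],  # fast, but rejected by classifier
]
classifier = lambda blob: blob.texture > 0.5  # hypothetical stand-in for the network
summaries = [summarize_shot(i, blobs, classifier) for i, blobs in enumerate(shots)]
events = detect_events(summaries)
```

The sketch keeps the separation described above: only the second level sees individual blobs, and the third level reasons purely over per-shot summaries, so the domain rule can be changed without touching feature extraction.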
