Analysis of video content using statistical spatio-temporal models
In this thesis we develop a statistical spatio-temporal modeling framework for the automatic understanding and structuring of video content. The framework is applied to echocardiogram videos, one of the most widely used modalities of diagnostic imaging. A digital library of the structured echocardiogram videos is created, and the contents of the structured videos are summarized and augmented with contextual information. The main contributions of this research are the following: (1) An approach is developed for multi-class object recognition under ambiguity resulting from model and data distortion factors. The novelty of the approach is that it uses the collection of generative models of the different object classes to project objects into an anchor space, which in effect fuses the assessments of the object by all models. Discriminative methods are then used to classify the object in the anchor space. The approach is shown to be effective in resolving confusion in object classification. (2) A novel framework is proposed for modeling the multi-phase activity pattern of objects represented as collections of parts. In this framework the idea of pictorial structures, a method for modeling the appearance of the individual parts of an object and their spatial relationships to one another, is extended to the time domain. Inference and learning methods are provided for the model, and experiments show the framework to be effective. (3) A framework is proposed for categorizing different activity patterns of an object. The novelty of the approach is that a statistical spatio-temporal model is learned from training exemplars only for the normal activity of the object, and all other forms of activity are assessed with respect to this model. Fisher score mapping is employed to obtain the assessment of each activity pattern with respect to the model of normal activity. The work presented in this thesis is the first on automatic understanding and structuring of the content of echocardiogram videos.
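As a minimal illustration of the Fisher score mapping mentioned in contribution (3), the sketch below fits a generative model to "normal" exemplars only and then maps any observation to the gradient of its log-likelihood with respect to the model parameters. The diagonal Gaussian is a deliberately simple stand-in for the thesis's spatio-temporal model, which is not specified in this abstract, and the function names `fit_gaussian` and `fisher_score` are hypothetical.

```python
from statistics import fmean


def fit_gaussian(samples):
    """Fit a diagonal Gaussian (assumed stand-in model) to 'normal' exemplars."""
    dims = len(samples[0])
    means = [fmean(s[d] for s in samples) for d in range(dims)]
    # Maximum-likelihood variance per dimension, floored to avoid division by zero.
    variances = [
        max(fmean((s[d] - means[d]) ** 2 for s in samples), 1e-9)
        for d in range(dims)
    ]
    return means, variances


def fisher_score(x, means, variances):
    """Gradient of log p(x | means, variances) with respect to all parameters.

    Returns the mean gradients followed by the variance gradients.
    """
    d_mean = [(xi - m) / v for xi, m, v in zip(x, means, variances)]
    d_var = [
        -0.5 / v + 0.5 * (xi - m) ** 2 / v ** 2
        for xi, m, v in zip(x, means, variances)
    ]
    return d_mean + d_var


# Toy 'normal activity' exemplars (made-up feature vectors for illustration).
normal = [[0.0, 1.0], [2.0, 3.0]]
means, variances = fit_gaussian(normal)

# Any observation, normal or not, is described relative to the normal model.
score = fisher_score([3.0, 2.0], means, variances)
```

The resulting fixed-length score vectors can then be passed to any discriminative classifier to categorize activity patterns; an observation far from the normal model produces large gradient components, while the scores of the training exemplars sum to zero at the maximum-likelihood fit.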