Hypothesis selection for scene interpretation using grammatical models of scene evolution

A major bottleneck in dynamic scene interpretation is the search that is required through a database to find a model that best matches the observed data. We show that the problem can be alleviated if the object model selection is controlled by a scene evolution model. We adopt a grammatical model to characterise objects and events in a dynamic scene which can be used to generate visual expectations within a particular context. The object hypotheses can be accepted without further search of the database provided a measure of the goodness of fit of the match between the selected model and the visual data falls below a threshold. In this paper we present experiments for determining the necessary thresholds for the model hypotheses testing using the recognition method described by Yang et al. (1994), as well as for assessing the subsequent performance of the scene interpretation system with and without the constraining grammar.