Semantic Video Model for Description, Detection and Retrieval of Visual Events. (Modèle sémantique de la vidéo pour la description, la détection et la recherche des événements visuels)

This thesis aims to make three advances in the domain of semantic multimedia, which can be defined as the application of semantic (Web) techniques to multimedia resources.

The first contribution concerns the generation of high-level (semantic) descriptions: while the extraction of low-level multimedia features has been extensively studied and largely solved in the image analysis community, the automatic generation of high-level descriptions from those low-level features remains an open problem. This thesis proposes a high-level description language for defining events and objects in terms of low-level features, as an attempt to narrow this "semantic gap".

The second contribution concerns reasoning for semantic multimedia. Because Semantic Web languages were conceived for describing all kinds of resources, they may lack the features needed to represent and reason about multimedia resources, which carry spatial and temporal information in particular and whose interpretation often involves a degree of uncertainty. This thesis addresses the problem by proposing a semantic language for video based on fuzzy conceptual graphs, together with the corresponding reasoning procedures.

The third contribution concerns multimedia databases and the indexing and retrieval techniques for semantic multimedia. A query language for semantic multimedia should allow a user to express spatiotemporal constraints as well as semantic constraints, and a query engine should be able to process these constraints together. This thesis therefore proposes a Datalog-like query language for expressing spatiotemporal and semantic queries, along with a constraint-solving and reasoning procedure for answering these queries.
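To make the first contribution more concrete, the sketch below is a minimal, hypothetical example (not the description language defined in the thesis) of how a high-level event such as "a tracked object enters a zone" can be expressed as a predicate over low-level features; the `Detection` structure, the zone coordinates and the helper functions are illustrative assumptions.

```python
# A minimal sketch of bridging the "semantic gap": a high-level event
# ("a tracked object enters a zone") defined as a predicate over low-level
# features (per-frame bounding boxes from a tracker). All names and values
# are hypothetical, not taken from the thesis.
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class Detection:
    frame: int                                   # frame index
    box: tuple[float, float, float, float]       # x1, y1, x2, y2 in image coordinates

def center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def inside(point, zone):
    (x, y), (zx1, zy1, zx2, zy2) = point, zone
    return zx1 <= x <= zx2 and zy1 <= y <= zy2

def enters_zone(track: list[Detection], zone) -> int | None:
    """Return the first frame at which the track moves from outside to inside `zone`."""
    was_inside = None
    for det in sorted(track, key=lambda d: d.frame):
        now_inside = inside(center(det.box), zone)
        if was_inside is False and now_inside:
            return det.frame
        was_inside = now_inside
    return None

if __name__ == "__main__":
    door = (100, 0, 200, 300)                    # hypothetical "doorway" region
    track = [Detection(10, (20, 50, 60, 150)),   # centre outside the zone
             Detection(11, (70, 50, 110, 150)),  # centre still outside
             Detection(12, (120, 50, 160, 150))] # centre now inside -> event at frame 12
    print(enters_zone(track, door))              # 12
```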
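The second contribution relies on fuzzy conceptual graphs. The following sketch, assuming a toy concept-type hierarchy and membership degrees combined with min, illustrates the general idea of matching a query graph against a video annotation graph; it is not the formalism of the thesis, and all types, relation names and degrees are invented.

```python
# A minimal sketch of fuzzy conceptual graphs for video annotations: concepts and
# relations carry membership degrees in [0, 1], and matching a query graph against
# an annotation graph combines degrees with min (a common fuzzy conjunction).
from __future__ import annotations
from dataclasses import dataclass, field
from itertools import permutations

# Hypothetical concept-type hierarchy: child -> parent.
TYPE_PARENT = {"Car": "Vehicle", "Vehicle": "Object", "Person": "Object", "Object": "Thing"}

def is_subtype(t: str, ancestor: str) -> bool:
    """True if t equals ancestor or lies below it in the hierarchy."""
    while t is not None:
        if t == ancestor:
            return True
        t = TYPE_PARENT.get(t)
    return False

@dataclass
class Concept:
    ctype: str            # concept type, e.g. "Person"
    degree: float = 1.0   # confidence that the detection really is of this type

@dataclass
class Relation:
    name: str             # e.g. "leftOf", "before"
    args: tuple           # indices of the concepts it links
    degree: float = 1.0

@dataclass
class FuzzyCG:
    concepts: list[Concept] = field(default_factory=list)
    relations: list[Relation] = field(default_factory=list)

def match_degree(query: FuzzyCG, annotation: FuzzyCG) -> float:
    """Best degree of a projection of `query` into `annotation` (0.0 if none exists).

    Brute-force search over concept assignments; degrees are combined with min.
    """
    best = 0.0
    n, m = len(query.concepts), len(annotation.concepts)
    for assign in permutations(range(m), n):
        deg, ok = 1.0, True
        for qi, ai in enumerate(assign):
            qc, ac = query.concepts[qi], annotation.concepts[ai]
            if not is_subtype(ac.ctype, qc.ctype):   # annotation must specialise the query
                ok = False
                break
            deg = min(deg, ac.degree)
        if not ok:
            continue
        for rel in query.relations:
            image = tuple(assign[i] for i in rel.args)
            cand = [r for r in annotation.relations if r.name == rel.name and r.args == image]
            if not cand:
                ok = False
                break
            deg = min(deg, max(r.degree for r in cand))
        if ok:
            best = max(best, deg)
    return best

if __name__ == "__main__":
    annotation = FuzzyCG(
        concepts=[Concept("Person", 0.9), Concept("Car", 0.7)],
        relations=[Relation("leftOf", (0, 1), 0.8)],
    )
    query = FuzzyCG(
        concepts=[Concept("Person"), Concept("Vehicle")],
        relations=[Relation("leftOf", (0, 1))],
    )
    print(match_degree(query, annotation))  # 0.7 = min(0.9, 0.7, 0.8)
```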
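For the third contribution, the sketch below shows in plain Python, rather than in the thesis's Datalog-like syntax, how a conjunctive query mixing semantic atoms with a temporal-overlap constraint might be evaluated; the fact base, intervals and relation names are illustrative assumptions.

```python
# A minimal sketch of answering a query that mixes semantic and spatiotemporal
# constraints: semantic atoms are matched against a fact base, while the temporal
# constraint (interval overlap) is checked by a small constraint test.
# All relations and facts below are invented for illustration.

# Fact base: (relation, subject, object) triples, plus time intervals per entity.
FACTS = {
    ("isA", "obj1", "Person"),
    ("isA", "obj2", "Car"),
    ("leftOf", "obj1", "obj2"),
}
INTERVALS = {            # entity -> (start_frame, end_frame)
    "obj1": (100, 250),
    "obj2": (200, 400),
}

def overlaps(a, b):
    """Temporal overlap between two closed intervals."""
    return a[0] <= b[1] and b[0] <= a[1]

def query():
    """Answer: ?x, ?y such that isA(x, Person), isA(y, Car), leftOf(x, y)
    and the time intervals of x and y overlap."""
    persons = {s for (r, s, o) in FACTS if r == "isA" and o == "Person"}
    cars = {s for (r, s, o) in FACTS if r == "isA" and o == "Car"}
    answers = []
    for x in persons:
        for y in cars:
            if ("leftOf", x, y) in FACTS and overlaps(INTERVALS[x], INTERVALS[y]):
                answers.append((x, y))
    return answers

if __name__ == "__main__":
    print(query())   # [('obj1', 'obj2')]
```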